file
Log network data.
The file module provides methods and objects designed to
simplify writing and reading network traffic logs.
The main objects responsible for logging network data are:
- WriteFile for formatting and writing network data from a single connection to log file(s).
- ReadFile for reading data from log file(s) representing a single network connection.
- LogConnection for logging data from a single connection to log file(s).
These objects can write/read arbitrary data to/from log file(s). The network
connection can be specified by either a Connection or an MCL
Message object. The following objects can only write/read
Message objects to/from log file(s):
- LogNetwork for logging data from multiple connections to a directory of log files.
- ReadDirectory for reading data from a directory of log files representing multiple network connections.
Example: Log raw data
The following code illustrates writing (LogConnection) and reading
(ReadFile) raw data to/from log files:
import os
import time
from mcl import ReadFile
from mcl import LogConnection
from mcl import RawBroadcaster
from mcl.network.udp import Connection
# Path (prefix) to log file. EXAMPLE_PATH is assumed to be an existing,
# writable directory.
prefix = os.path.join(EXAMPLE_PATH, 'example')
# Create UDP connection.
connection = Connection('ff15::c73d:ce41:ea8b:c0a0')
# Log raw data transmissions.
logger = LogConnection(prefix, connection)
# Create raw broadcaster from IPv6 connection and broadcast data.
broadcaster = RawBroadcaster(connection)
broadcaster.publish('hello world')
time.sleep(0.1)
# Close broadcaster and stop logger.
broadcaster.close()
logger.close()
# Ensure that the log file exists.
log_file = os.path.join(EXAMPLE_PATH, 'example.log')
print os.path.exists(log_file)
# Read contents of log file.
rf = ReadFile(log_file)
print rf.read()['payload']
Example: Log message data
The following code illustrates writing (LogConnection) and reading
(ReadFile) a single message type to/from log files. Note that the
example is largely the same as the previous example:
import os
import time
from mcl import Message
from mcl import ReadFile
from mcl import LogConnection
from mcl import MessageBroadcaster
from mcl.network.udp import Connection
# Path (prefix) to log file. EXAMPLE_PATH is assumed to be an existing,
# writable directory.
prefix = os.path.join(EXAMPLE_PATH, 'example')
# Create MCL message.
class ExampleMessage(Message):
    mandatory = ('data',)
    connection = Connection('ff15::c43d:ce41:ea7b:c1b0')
# Log message transmissions.
logger = LogConnection(prefix, ExampleMessage)
# Create message broadcaster and broadcast data.
broadcaster = MessageBroadcaster(ExampleMessage)
broadcaster.publish(ExampleMessage(data='hello world'))
time.sleep(0.1)
# Close broadcaster and stop logger.
broadcaster.close()
logger.close()
# Ensure that the log file exists.
log_file = os.path.join(EXAMPLE_PATH, 'example.log')
print os.path.exists(log_file)
# Read contents of log file as an unformatted dictionary.
rf = ReadFile(log_file)
msg = rf.read()['payload']
print type(msg)
print msg
# Read contents of log file as an ExampleMessage.
rf = ReadFile(log_file, message=True)
print type(rf.read()['payload'])
Example: Log network data
The following code illustrates writing (LogNetwork) and reading
(ReadDirectory) multiple network message types to/from log
files.
import os
import time
from mcl import Message
from mcl import LogNetwork
from mcl import ReadDirectory
from mcl import MessageBroadcaster
from mcl.network.udp import Connection
# Create MCL messages.
class ExampleMessageA(Message):
    mandatory = ('string',)
    connection = Connection('ff15::c43d:ce41:ae5b:d1b0')
class ExampleMessageB(Message):
    mandatory = ('number',)
    connection = Connection('ff15::c43d:ce41:ae5b:d1b1')
# Log network traffic.
messages = [ExampleMessageA, ExampleMessageB]
logger = LogNetwork(EXAMPLE_PATH, messages)
logger.open()
log_path = logger.directory
# Create message broadcasters and broadcast data.
broadcaster_A = MessageBroadcaster(ExampleMessageA)
broadcaster_B = MessageBroadcaster(ExampleMessageB)
broadcaster_A.publish(ExampleMessageA(string='one')); time.sleep(0.1)
broadcaster_A.publish(ExampleMessageA(string='two')); time.sleep(0.1)
broadcaster_B.publish(ExampleMessageB(number=1)); time.sleep(0.1)
broadcaster_B.publish(ExampleMessageB(number=2)); time.sleep(0.1)
# Close broadcasters and stop logger.
broadcaster_A.close()
broadcaster_B.close()
logger.close()
# Ensure that the log directory exists.
print os.path.exists(log_path)
# Read contents of log files as unformatted dictionaries. Note that each
# message type has been recorded in a separate .log file.
rf = ReadDirectory(log_path)
for i in range(4):
    msg = rf.read()['payload']
    print type(msg), msg
# Like ReadFile(), ReadDirectory() can return the logged data as MCL
# messages.
rf = ReadDirectory(log_path, message=True)
for i in range(4):
    msg = rf.read()['payload']
    print type(msg), msg
Functions
retrieve_git_hash(repository_path)
Retrieve git hash from repository.
Parameters: repository_path (str) – Path to git repository (.git).
Returns: Current hash of the git repository. If the git hash could not be retrieved, None is returned.
Return type: str
Raises: IOError – If the repository path does not exist.
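The behaviour documented for retrieve_git_hash() can be approximated with the standard library. The following is a minimal sketch, not MCL's implementation: sketch_git_hash is a hypothetical name, and shelling out to `git rev-parse HEAD` is an assumption about how such a hash might be obtained.

```python
import os
import subprocess


def sketch_git_hash(repository_path):
    """Return the current commit hash of a repository, or None on failure."""
    if not os.path.exists(repository_path):
        raise IOError("Repository path does not exist: %s" % repository_path)

    # Ask git which commit HEAD points to, treating repository_path as the
    # .git directory (assumption: the git executable is on the PATH).
    command = ['git', '--git-dir=' + repository_path, 'rev-parse', 'HEAD']
    try:
        with open(os.devnull, 'w') as devnull:
            output = subprocess.check_output(command, stderr=devnull)
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or the path is not a git repository.
        return None

    return output.decode('ascii').strip()
```

The returned string could then be passed as the revision argument of the logging classes described below.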
Classes
class LogConnection(prefix, connection, revision=None, time_origin=None, max_entries=None, max_time=None, open_init=True)
Open a connection and record data to file.
Parameters:
- prefix (str) – Prefix used for log file(s). The extension is excluded and is handled by WriteFile (to facilitate split logs). For example, the prefix ‘./data/TestMessage’ will log data to the file ‘./data/TestMessage.log’, or to the files ‘./data/TestMessage_<NNN>.log’ for split log files (where NNN is incremented for each new split log).
- connection (Connection or Message) – MCL connection instance or MCL Message type to record to log file(s).
- revision (str) – Revision of code used to generate logs. For instance, the hash identifying a commit in a Git repository can be used to record what version of code was used during logging. The function retrieve_git_hash() can be used for this purpose. If revision is set to None (default), no revision will be recorded in the log header.
- time_origin (datetime.datetime) – Time origin used to calculate elapsed time during logging (time data was received - time origin). This option allows the time origin to be synchronised across multiple log files. If set to None, the time origin will be set to the time the first logged message was received. This results in the first logged item having an elapsed time of zero.
- max_entries (int) – Maximum number of entries to record per log file. If set, a new log file will be created once the maximum number of entries has been recorded. Files follow the naming scheme ‘<prefix>_<NNN>.log’ where NNN is incremented for each new log file. If set to None, all data will be logged to a single file called ‘<prefix>.log’. This option can be used in combination with max_time.
- max_time (int) – Maximum length of time, in seconds, to log data. If set, a new log file will be created after the maximum length of time has elapsed. Files follow the naming scheme ‘<prefix>_<NNN>.log’ where NNN is incremented for each new log file. If set to None, all data will be logged to a single file called ‘<prefix>.log’. This option can be used in combination with max_entries.
- open_init (bool) – If set to True, open connection immediately after initialisation (default). If set to False, only open connection and log data when open() is called.
max_entries (int)
Maximum number of entries to record per log file before splitting.
max_time (int)
Maximum length of time, in seconds, to log data before splitting.
close()
Stop logging connection data.
Returns: True if the connection logger was closed. If the connection logger was already closed, the request is ignored and the method returns False.
Return type: bool
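The time_origin behaviour described for LogConnection (elapsed time is the time the data was received minus the time origin, with the first record defining the origin when none is given) can be illustrated with a short sketch. elapsed_seconds is a hypothetical helper written for this illustration and is not part of MCL.

```python
from datetime import datetime


def elapsed_seconds(time_received, time_origin=None):
    """Return (elapsed seconds, origin used) for one received record."""
    # With no origin supplied, the record itself defines the origin, so
    # its elapsed time is zero.
    if time_origin is None:
        time_origin = time_received
    return (time_received - time_origin).total_seconds(), time_origin


# The first record fixes the origin; later records are measured against it.
first, origin = elapsed_seconds(datetime(1999, 12, 31, 23, 59, 59))
second, _ = elapsed_seconds(datetime(2000, 1, 1, 0, 0, 1), origin)
```

Passing the same origin to several loggers is what would keep their elapsed times comparable across files.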
class LogNetwork(directory, messages, revision=None, max_entries=None, max_time=None, open_init=True)
Dump network traffic to files.
The LogNetwork object records network traffic to multiple log files. The input directory specifies the location in which to create a directory for logging network traffic, named using the following format:
<year><month><day>T<hours><minutes><seconds>_<hostname>
The input messages specifies a list of MCL Message objects to record. A log file is created for each message specified in the input messages. For instance, if messages specifies a configuration for receiving MessageA and MessageB objects, the following directory tree will be created (almost midnight on December 31st 1999):
directory/19991231T235959_host/
    |-MessageA.log
    |-MessageB.log
If split logging has been enabled (by the number of entries, elapsed time or both), the log file names will be appended with an incrementing counter:
directory/19991231T235959_host/
    |-MessageA_000.log
    |-MessageA_001.log
    |-MessageB_000.log
    |-MessageB_001.log
    |-MessageB_002.log
    |-MessageB_003.log
Parameters:
- directory (str) – Path in which to record a directory of network traffic.
- messages (list) – List of Message objects specifying the network traffic to be logged.
- revision (str) – Revision of code used to generate logs. For instance, the hash identifying a commit in a Git repository can be used to record what version of code was used during logging. The function retrieve_git_hash() can be used for this purpose. If revision is set to None (default), no revision will be recorded in the log header.
- max_entries (int) – Maximum number of entries to record per log file. If set, a new log file will be created once the maximum number of entries has been recorded. If set to None, all data will be logged to a single file. This option can be used in combination with max_time.
- max_time (int) – Maximum length of time, in seconds, to log data. If set, a new log file will be created after the maximum length of time has elapsed. If set to None, all data will be logged to a single file. This option can be used in combination with max_entries.
- open_init (bool) – If set to True, open connection immediately after initialisation (default). If set to False, only open connection and log data when open() is called.
root_directory (str)
Location where new log directories are created. This attribute returns the path given by the directory argument.
directory (str)
String specifying the directory where data is being recorded. This attribute is set to None if the data is NOT being logged to file (stopped state). If the logger is recording data, this attribute is returned as a full path to a newly created directory in the specified directory input, using the following format:
<year><month><day>T<hours><minutes><seconds>_<hostname>
max_entries (int)
Maximum number of entries to record per log file. If set to None, all data will be logged to a single file.
max_time (int)
Maximum length of time, in seconds, to log data. If set to None, all data will be logged to a single file.
Raises:
- IOError – If the log directory does not exist.
- TypeError – If any of the inputs are an incorrect type.
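The directory naming format used by LogNetwork, <year><month><day>T<hours><minutes><seconds>_<hostname>, can be reproduced with strftime. This is a sketch only: log_directory_name is a hypothetical helper, and the use of local time and socket.gethostname() as defaults is an assumption rather than documented MCL behaviour.

```python
import os
import socket
from datetime import datetime


def log_directory_name(root, when=None, hostname=None):
    """Build '<root>/<YYYYMMDD>T<HHMMSS>_<hostname>'."""
    # Fall back to the current time and this machine's name (assumptions).
    when = when or datetime.now()
    hostname = hostname or socket.gethostname()
    return os.path.join(root, when.strftime('%Y%m%dT%H%M%S') + '_' + hostname)


# The "almost midnight on December 31st 1999" example from the text.
path = log_directory_name('directory', datetime(1999, 12, 31, 23, 59, 59), 'host')
```

Because the timestamp is zero-padded and ordered from year to second, directory names created this way sort chronologically as plain strings.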
class ReadDirectory(source, min_time=None, max_time=None, message=False, ignore_raw=True)
Read data from multiple log files in time order.
The ReadDirectory object reads data from multiple network dump log files in a common directory. The directory may contain single or split log files (see WriteFile and ReadFile).
Note: ReadDirectory assumes the log files have been created by WriteFile and searches for files with the .log extension in the specified directory. ReadDirectory can operate on directories which contain non-.log files. Renaming .log files or including .log files which were not formatted by WriteFile is likely to cause an error in ReadDirectory.
Parameters:
- source (str) – Path to directory containing log files.
- min_time (float) – Minimum time to extract from log file in seconds.
- max_time (float) – Maximum time to extract from log file in seconds.
- message (bool) – If set to False (default), the logged data is returned ‘raw’. If set to True, logged data will automatically be decoded into the MCL message type stored in the log file header. Note: to read data as MCL messages, the messages must be loaded into the namespace.
- ignore_raw (bool) – If set to True (default), any raw log files in the path source will be ignored. If set to False, an exception will be raised if any raw logs are encountered.
min_time (float)
Minimum time to extract from log file in seconds.
max_time (float)
Maximum time to extract from log file in seconds.
Raises:
- TypeError – If any of the inputs are an incorrect type.
- IOError – If the log file/directory does not exist.
- ValueError – If the minimum time is greater than the maximum time.
is_data_pending()
Return whether data is available for reading.
Returns: True if more data is available. If all data has been read from the log file(s), False is returned.
Return type: bool
read()
Read data from the log files.
Read a line of data from the log files. The data is parsed into a dictionary containing the following fields:
{'elapsed_time': <float>, 'topic': <string>, 'payload': <dict or Message>}
where:
- elapsed_time is the time elapsed between creating the log file and recording the network data.
- topic is the topic associated with the network data during the broadcast.
- payload is the network data, delivered as a dictionary or MCL Message object.
If all network data has been read from the log files (directory), None is returned.
Returns: A dictionary containing the time elapsed when the line of text was recorded, the topic associated with the message broadcast and the payload (a dictionary or a populated MCL message object).
Return type: dict
Raises: IOError – If an error was encountered during reading.
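Reading several logs "in time order" amounts to merging streams that are each already sorted by elapsed_time. The sketch below shows that idea with heapq.merge and mirrors the is_data_pending()/read() interface described above. SketchReader is a hypothetical stand-in operating on in-memory lists; it is not how ReadDirectory is implemented.

```python
import heapq


class SketchReader(object):
    """Stand-in reader that merges several record lists by elapsed time."""

    def __init__(self, *logs):
        # Tag each record with (elapsed_time, log index, position) so that
        # ties are broken deterministically and heapq.merge never needs to
        # compare the record dictionaries themselves.
        tagged = [
            [(rec['elapsed_time'], i, j, rec) for j, rec in enumerate(log)]
            for i, log in enumerate(logs)
        ]
        self._records = heapq.merge(*tagged)
        self._next = next(self._records, None)

    def is_data_pending(self):
        return self._next is not None

    def read(self):
        # Return None once every record has been consumed.
        if self._next is None:
            return None
        record = self._next[-1]
        self._next = next(self._records, None)
        return record


log_a = [{'elapsed_time': 0.0, 'topic': '', 'payload': {'string': 'one'}},
         {'elapsed_time': 0.2, 'topic': '', 'payload': {'string': 'two'}}]
log_b = [{'elapsed_time': 0.1, 'topic': '', 'payload': {'number': 1}},
         {'elapsed_time': 0.3, 'topic': '', 'payload': {'number': 2}}]

# Drain the reader without knowing the number of records in advance.
reader = SketchReader(log_a, log_b)
payloads = []
while reader.is_data_pending():
    payloads.append(reader.read()['payload'])
```

Looping on is_data_pending() avoids hard-coding the record count, which the earlier example does with range(4).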
class ReadFile(filename, min_time=None, max_time=None, message=False)
Read data from a log file.
The ReadFile object reads data from network dump log files (see WriteFile). If the data has been logged to a single file, ReadFile can read the data directly from the file:
rf = ReadFile('logs/TestMessage.log')
If the log files have been split, ReadFile can read from the first split to the last split (in the directory) by specifying the prefix of the logs:
rf = ReadFile('logs/TestMessage')
A portion of a split log file can be read by specifying the path to the specific portion:
rf = ReadFile('logs/TestMessage_002.log')
Note that if a portion of a split log file is read using ReadFile, header information will not be available. Header information is only recorded in the first portion.
Parameters:
- filename (str) – Prefix/Path to log file. If a prefix is given, ReadFile will assume the log files have been split into numbered chunks. For example, if ‘data/TestMessage’ is specified, ReadFile will read all ‘data/TestMessage_*.log’ files in sequence. If the path to a log file is fully specified, ReadFile will only read the contents of that file (e.g. ‘data/TestMessage_000.log’).
- min_time (float) – Minimum time to extract from log file.
- max_time (float) – Maximum time to extract from log file.
- message (bool or str or Message) – If set to False (default), the logged data is returned ‘raw’. If set to True, logged data will automatically be decoded into the MCL message type stored in the log file header. To force the reader to unpack logged data as a specific MCL message type, set this argument to the required Message type or to the string name of the required message type. This option can be useful for reading unnamed messages or debugging log files. Use with caution. Note: to read data as MCL messages, the messages must be loaded into the namespace.
header (dict)
Contents of the log file header. If the log file header is not available, None is returned; otherwise the following dictionary is returned:
dct = {'text': string, 'end': int, 'version': string, 'revision': string, 'created': string, 'type': None or Message}
where:
- <text> is the header text
- <end> is a pointer to the end of the header
- <version> is the version used to record log files
- <revision> is the Git hash of the version used to log data
- <created> is the time when the log file was created
- <type> is the type, recorded in the header, used to represent the logged data (either None or Message)
min_time (float)
Minimum time to extract from log file.
max_time (float)
Maximum time to extract from log file.
Raises:
- TypeError – If any of the inputs are an incorrect type.
- IOError – If the log file/directory does not exist.
- ValueError – If the minimum time is greater than the maximum time.
is_data_pending()
Return whether data is available for reading.
Returns: True if more data is available. If all data has been read from the log file(s), False is returned.
Return type: bool
read()
Read data from the log file(s).
Read one line of data from the log file(s). The data is parsed into a dictionary containing the following fields:
dct = {'elapsed_time': <float>, 'topic': <string>, 'payload': <dict or Message>}
where:
- elapsed_time is the time elapsed between creating the log file and recording the network data.
- topic is the topic associated with the network data during the broadcast.
- payload is the network data, delivered as a dictionary or MCL Message object.
If all data has been read from the log file, None is returned.
Returns: A dictionary containing the time elapsed when the line of text was recorded, the topic associated with the message broadcast and the payload (a dictionary or a populated MCL message object).
Return type: dict
Raises: IOError – If an error was encountered during reading.
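The distinction ReadFile draws between a fully specified path and a prefix can be sketched with glob. resolve_log_files is a hypothetical helper for illustration; treating any argument ending in '.log' as a full path is an assumption, not documented MCL behaviour.

```python
import glob


def resolve_log_files(filename):
    """Return the list of log files a path or prefix refers to."""
    # A fully specified path names exactly one file.
    if filename.endswith('.log'):
        return [filename]
    # Otherwise treat the argument as a prefix and collect the numbered
    # splits. The three-digit, zero-padded counters sort correctly as
    # plain strings.
    return sorted(glob.glob(filename + '_*.log'))
```

For instance, resolve_log_files('data/TestMessage') would collect every 'data/TestMessage_*.log' file in sequence, while resolve_log_files('data/TestMessage_000.log') refers to that single portion only.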
class WriteFile(prefix, connection, revision=None, time_origin=None, max_entries=None, max_time=None)
Write network messages to log file(s).
The WriteFile object is used for writing network messages to log file(s). To log data to a single file, use:
wf = WriteFile(fname, Message)
WriteFile can be configured to split the log files by number of entries or time. To configure WriteFile to split log files according to the number of entries, instantiate the object using:
wf = WriteFile(fname, Message, max_entries=10)
In the above example, each log file will accumulate 10 entries before closing and starting a new log file. To configure WriteFile to split log files according to time, instantiate the object using:
wf = WriteFile(fname, Message, max_time=60)
In the above example, each log file will accumulate data for 60 seconds before closing and starting a new log file. The two limits can also be combined. For example:
wf = WriteFile(fname, Message, max_entries=10, max_time=60)
will accumulate a maximum of 10 entries for a maximum of 60 seconds before closing and starting a new log file. The first condition to be breached will cause a new log file to be created.
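The splitting rule ("the first condition to be breached") and the ‘<prefix>_<NNN>.log’ naming scheme can be expressed as two small functions. Both should_split and split_name are hypothetical helpers for illustration; whether MCL compares with >= or > at the boundary is an assumption here.

```python
def should_split(entries, elapsed, max_entries=None, max_time=None):
    """Return True once either configured limit has been reached."""
    # A limit of None disables that condition entirely.
    if max_entries is not None and entries >= max_entries:
        return True
    if max_time is not None and elapsed >= max_time:
        return True
    return False


def split_name(prefix, index=None):
    """Return '<prefix>.log', or '<prefix>_<NNN>.log' for split logs."""
    if index is None:
        return prefix + '.log'
    return '%s_%03d.log' % (prefix, index)
```

With max_entries=10 and max_time=60, a file holding 10 entries after 5 seconds and a file holding 3 entries after 61 seconds would both trigger a new split under this sketch.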
Parameters:
- prefix (str) – Prefix used for log file(s). The extension is excluded and is handled by WriteFile (to facilitate split logs). For example, the prefix ‘./data/TestMessage’ will log data to the file ‘./data/TestMessage.log’, or to the files ‘./data/TestMessage_<NNN>.log’ for split log files (where NNN is incremented for each new split log).
- connection (Connection or Message) – An instance of an MCL connection object or a reference to an MCL message type to record to log file(s).
- revision (str) – Revision of code used to generate logs. For instance, the hash identifying a commit in a Git repository can be used to record what version of code was used during logging. The function retrieve_git_hash() can be used for this purpose. If revision is set to None (default), no revision will be recorded in the log header.
- time_origin (datetime.datetime) – UTC time origin used to calculate elapsed time during logging (time data was received - time origin). This option allows the time origin to be synchronised across multiple log files. If set to None, the time origin will be set to the time the first logged message was received. This results in the first logged item having an elapsed time of zero.
- max_entries (int) – Maximum number of entries to record per log file. If set, a new log file will be created once the maximum number of entries has been recorded. Files follow the naming scheme ‘<prefix>_<NNN>.log’ where NNN is incremented for each new log file. If set to None, all data will be logged to a single file called ‘<prefix>.log’. This option can be used in combination with max_time.
- max_time (int) – Maximum length of time, in seconds, to log data. If set, a new log file will be created after the maximum length of time has elapsed. Files follow the naming scheme ‘<prefix>_<NNN>.log’ where NNN is incremented for each new log file. If set to None, all data will be logged to a single file called ‘<prefix>.log’. This option can be used in combination with max_entries.
max_entries (int)
Maximum number of entries to record per log file before splitting.
max_time (int)
Maximum length of time, in seconds, to log data before splitting.
Raises:
- IOError – If the write directory does not exist.
- ValueError – If any of the inputs are improperly specified.
close()
Close log files.
The WriteFile.close() method finalises the logging process by changing the extension of the log file from ‘.tmp’ to ‘.log’. If WriteFile.close() is NOT called, no data will be lost; however, the log file will not be given the ‘.log’ extension.
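The write-to-‘.tmp’, rename-on-close pattern described for WriteFile.close() can be sketched as follows. SketchLog is a hypothetical class written for this illustration; the exact temporary file name MCL uses (here ‘<prefix>.tmp’) is an assumption.

```python
import os
import shutil
import tempfile


class SketchLog(object):
    """Write to '<prefix>.tmp' and rename to '<prefix>.log' on close."""

    def __init__(self, prefix):
        self.temp_path = prefix + '.tmp'
        self.final_path = prefix + '.log'
        self._handle = open(self.temp_path, 'w')

    def write(self, text):
        # Flush after every line so that no data is lost if close() is
        # never called; the file simply keeps its '.tmp' extension.
        self._handle.write(text + '\n')
        self._handle.flush()

    def close(self):
        self._handle.close()
        os.rename(self.temp_path, self.final_path)


directory = tempfile.mkdtemp()
log = SketchLog(os.path.join(directory, 'example'))
log.write('hello world')
before_close = sorted(os.listdir(directory))  # directory contents while logging
log.close()
after_close = sorted(os.listdir(directory))   # directory contents once finalised
shutil.rmtree(directory)
```

One consequence of this design is that a reader scanning for ‘.log’ files never picks up a file that is still being written.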
write(message)
Write network data to a file.
The WriteFile.write() method writes network data to a log file. WriteFile.write() expects network data to be input as a dictionary with the following fields:
message = {'topic': str(), 'payload': object(), 'time_received': datetime}
where:
- topic is the topic associated with the network data during the broadcast.
- payload is the network data to be recorded to file.
- time_received is a datetime.datetime object used to record the time the network data was received.
Parameters: message (dict) – Network data to be recorded. The network data must be stored as a dictionary with the time the data was received, the topic associated with the broadcast and the message payload.