DataStore class for WorkBench.
Bases: object
DataStore for Workbench.
Currently tied to MongoDB, but making this class ‘abstract’ should be straightforward, which would open the door to another backend.
Initialization for the Workbench data store class.
Parameters: |
|
---|
Store a sample into the datastore.
Parameters: |
|
---|---|
Returns: | md5 digest of the sample. |
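As a rough illustration of a store keyed by md5 digest, the sketch below keeps samples in a plain dictionary. The function name `store_sample_sketch`, its parameter names, the stored field names and the in-memory `samples` dict are all assumptions made for this example. The real class writes to MongoDB and its exact signature is not shown here.

```python
import hashlib

# Hypothetical in-memory stand-in for the MongoDB-backed store.
samples = {}

def store_sample_sketch(sample_bytes, filename, type_tag):
    """Store the bytes under their md5 digest and return that digest."""
    md5 = hashlib.md5(sample_bytes).hexdigest()
    samples[md5] = {
        'filename': filename,      # metadata only, not used for lookup
        'type_tag': type_tag,
        'length': len(sample_bytes),
        'raw_bytes': sample_bytes,
    }
    return md5

md5 = store_sample_sketch(b'hello', 'hello.txt', 'txt')
```

Because the key is derived from the content, storing the same bytes twice yields the same md5 and overwrites the same entry.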
Clean data in preparation for serialization.
Deletes items whose value is a BSON, datetime, dict or list instance, or whose key starts with __.
Parameters: | data – Sample data to be serialized. |
---|---|
Returns: | Cleaned data dictionary. |
Clean data in preparation for storage.
Deletes items whose key contains a ‘.’ or is ‘_id’. Also deletes items whose value is a dictionary or a list.
Parameters: | data – Sample data dictionary to be cleaned. |
---|---|
Returns: | Cleaned data dictionary. |
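The storage cleaning rule can be sketched in a few lines. This follows the description literally and is not the class's actual code: the name `clean_for_storage_sketch` is hypothetical, and returning a copy rather than modifying the input in place is an assumption.

```python
def clean_for_storage_sketch(data):
    """Return a copy of data without the items the description rules out."""
    cleaned = {}
    for key, value in data.items():
        if '.' in key or key == '_id':
            continue  # key contains a '.' or is '_id'
        if isinstance(value, (dict, list)):
            continue  # dictionary and list values are dropped
        cleaned[key] = value
    return cleaned

record = {'_id': 1, 'a.b': 2, 'name': 'x', 'nested': {'k': 1}, 'items': [1], 'size': 3}
cleaned = clean_for_storage_sketch(record)
```

Only the `name` and `size` items survive for this input.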
Supports partial (short) md5s: this method returns the full md5.
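One plausible way to expand a short md5 is a prefix match against the known digests, sketched below. The name `resolve_md5_sketch` is hypothetical, and raising `ValueError` when the prefix matches zero or several samples is an assumption; the real method may handle those cases differently.

```python
def resolve_md5_sketch(partial, known_md5s):
    """Return the single full md5 that starts with partial, else raise."""
    if len(partial) == 32:
        return partial  # already a full md5
    matches = [m for m in known_md5s if m.startswith(partial)]
    if len(matches) != 1:
        raise ValueError('%d matches for md5 prefix %r' % (len(matches), partial))
    return matches[0]

known = ['5d41402abc4b2a76b9719d911017c592', 'd41d8cd98f00b204e9800998ecf8427e']
full = resolve_md5_sketch('5d41', known)
```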
Get the sample from the data store.
This method first fetches the data from the datastore, then cleans it for serialization, and finally adds the ‘raw_bytes’ item to it.
Parameters: | md5 – The md5 digest of the sample to be fetched from datastore. |
---|---|
Returns: | The sample dictionary or None |
Get a window of samples not to exceed size (in MB).
Parameters: |
|
---|---|
Returns: | a list of md5s. |
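A size-bounded window can be sketched as a running total over samples taken newest first. The field names `length` and `import_time`, the function name `sample_window_sketch`, and the choice to stop at the first sample that would exceed the budget are all assumptions for illustration.

```python
def sample_window_sketch(samples, size_mb):
    """Return md5s of the newest samples whose total size fits in size_mb."""
    budget = size_mb * 1024 * 1024
    window, total = [], 0
    # Newest first; stop at the first sample that would exceed the budget.
    for sample in sorted(samples, key=lambda s: s['import_time'], reverse=True):
        if total + sample['length'] > budget:
            break
        window.append(sample['md5'])
        total += sample['length']
    return window

MB = 1024 * 1024
samples = [
    {'md5': 'a' * 32, 'length': 4 * MB, 'import_time': 1},
    {'md5': 'b' * 32, 'length': 5 * MB, 'import_time': 2},
    {'md5': 'c' * 32, 'length': 3 * MB, 'import_time': 3},
]
window = sample_window_sketch(samples, 10)
```

With a 10 MB budget the two newest samples (3 MB and 5 MB) fit, and the oldest 4 MB sample is left out.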
Checks if data store has this sample.
Parameters: | md5 – The md5 digest of the required sample. |
---|---|
Returns: | True if sample with this md5 is present, else False. |
List all samples that match the tags or all if tags are not specified.
Parameters: | tags – Match samples against these tags (or all if not specified) |
---|---|
Returns: | List of the md5s for the matching samples |
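Tag filtering might look like the sketch below. Whether a sample must carry all of the requested tags or just one of them is not stated above; this example assumes all of them. The name `list_samples_sketch` and the `tags` field are hypothetical.

```python
def list_samples_sketch(samples, tags=None):
    """Return md5s of samples carrying every requested tag (all if no tags)."""
    if not tags:
        return [s['md5'] for s in samples]
    wanted = set(tags)
    return [s['md5'] for s in samples if wanted.issubset(s['tags'])]

samples = [
    {'md5': 'a' * 32, 'tags': ['exe', 'bad']},
    {'md5': 'b' * 32, 'tags': ['pdf']},
    {'md5': 'c' * 32, 'tags': ['exe']},
]
exe_md5s = list_samples_sketch(samples, ['exe'])
```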
List of the tags and md5s for all samples
Returns: | List of the tags and md5s for all samples |
---|
Store the output results of the worker.
Parameters: |
|
---|
Get the results of the worker.
Parameters: |
|
---|---|
Returns: | Dictionary of the worker result. |
Return a list of all md5s matching the type_tag (‘exe’, ‘pdf’, etc.).
Parameters: | type_tag – the type of sample. |
---|---|
Returns: | a list of matching samples. |
Run periodic operations on the data store.
Operations like making sure collections are capped and indexes are set up.
A simple directory watcher. Credit: ronedg @ http://stackoverflow.com/questions/182197/how-do-i-watch-a-file-for-changes-using-python
Bases: object
A simple directory watcher
Initialize the Directory Watcher.
Parameters: | path – path of the directory to watch |
---|
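A directory watcher of this kind can be approximated by comparing directory listings between polls. The sketch below is one such approach, not the class's actual implementation: `DirWatcherSketch` and its `poll` method are hypothetical names, and the real watcher may use callbacks or a background loop instead.

```python
import os
import tempfile

class DirWatcherSketch(object):
    """Report files added to or removed from a directory between polls."""

    def __init__(self, path):
        self.path = path
        self.before = set(os.listdir(path))

    def poll(self):
        """Return (added, removed) file names since the previous poll."""
        after = set(os.listdir(self.path))
        added, removed = after - self.before, self.before - after
        self.before = after
        return sorted(added), sorted(removed)

# Demonstration on a throwaway directory.
watch_dir = tempfile.mkdtemp()
watcher = DirWatcherSketch(watch_dir)
open(os.path.join(watch_dir, 'sample.exe'), 'w').close()
added, removed = watcher.poll()
```

Polling only detects files that appear or disappear. Changes to the contents of an existing file would need an extra check, for example on modification time.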
ELSIndexer class for WorkBench.
Bases: object
ELS Stub.
Stub Indexer Initialization.
Bases: object
ELSIndexer class for WorkBench.
Initialization for the Elastic Search Indexer.
Parameters: | hosts – List of connection settings. |
---|
NeoDB class for WorkBench.
Bases: object
NeoDB Stub.
NeoDB Stub.
Bases: object
NeoDB indexer for Workbench.
Initialization for NeoDB indexer.
Parameters: | uri – The URI used to connect to NeoDB. |
---|---|
Raises: | RuntimeError – When the connection to NeoDB fails. |
Add the node with name and labels.
Parameters: |
|
---|---|
Raises: | NotImplementedError – When adding labels is not supported. |
Checks if the node is present.
Parameters: | node_id – Id for the node. |
---|---|
Returns: | True if node with node_id is present, else False. |
A simple plugin manager. Rolling my own for three reasons: 1) Environmental scan did not give me quite what I wanted. 2) The super simple examples didn’t support automatic/dynamic loading. 3) I kinda wanted to understand the process :)
Bases: object
Plugin Manager for Workbench.
Initialize the Plugin Manager for Workbench.
Parameters: |
|
---|
Parameters: | handler – the loaded plugin. |
---|
Plugin validation.
Every workbench plugin must have a top-level test method.
Parameters: | handler – The loaded plugin. |
---|---|
Returns: | The test function, or None if the test fails. |
Plugin validation
Every workbench plugin must have a dependencies list (even if it’s empty). Every workbench plugin must have an execute method.
Parameters: | plugin_class – The loaded plugin class. |
---|---|
Returns: | True if dependencies and execute are present, else False. |
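The two requirements (a dependencies list and an execute method) can be checked with plain attribute lookups. This is a minimal sketch: `validate_plugin_sketch` and the two example classes are hypothetical, and insisting that `dependencies` is a `list` instance is an assumption about how strict the real check is.

```python
def validate_plugin_sketch(plugin_class):
    """Return True if the class has a dependencies list and an execute method."""
    has_deps = isinstance(getattr(plugin_class, 'dependencies', None), list)
    has_execute = callable(getattr(plugin_class, 'execute', None))
    return has_deps and has_execute

class GoodPlugin(object):
    dependencies = []  # an empty list is still valid

    def execute(self, input_data):
        return {'ok': True}

class BadPlugin(object):
    # No dependencies attribute, so validation fails.
    def execute(self, input_data):
        return {}

good_ok = validate_plugin_sketch(GoodPlugin)
bad_ok = validate_plugin_sketch(BadPlugin)
```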
Workbench Server Version
Workbench: Open Source Security Framework
Bases: object
Workbench: Open Source Security Framework.
Initialize the Framework.
Parameters: |
|
---|
Store a sample into the DataStore.
Parameters: | input_bytes – the actual bytes of the sample, e.g. f.read() |
---|---|
 | filename – name of the file (used purely as metadata, not for lookup) |
 | type_tag – the type of the sample (‘exe’, ‘pcap’, ‘pdf’, ‘json’, ‘swf’, or ...) |
Returns: | the md5 of the sample. |
Get a sample from the DataStore.
Parameters: | md5 – the md5 of the sample |
---|---|
Returns: | A dictionary of metadata about the sample, including a ‘raw_bytes’ key that contains the raw bytes. |
Raises: | Workbench.DataNotFound – if the sample is not found. |
Does the md5 represent a sample_set?
Parameters: | md5 – the md5 of the sample_set |
---|---|
Returns: | True/False |
Get a window of samples from the DataStore.
Parameters: | type_tag – the type of samples (‘pcap’, ‘exe’, ‘pdf’) |
---|---|
 | size – the size of the window in megabytes (10 = 10 MB) |
Returns: | A sample_set handle which represents the newest samples within the size window |
Do we have this sample in the DataStore?
Parameters: | md5 – the md5 of the sample |
---|---|
Returns: | True or False |
Combine samples together. This has various use cases, the most significant being that a bunch of sample ‘chunks’ were uploaded and now need to be combined.
Parameters: | md5_list – the list of md5s to combine; order matters! |
---|---|
 | filename – name of the file (used purely as metadata, not for lookup) |
 | type_tag – the type of the sample (‘exe’, ‘pcap’, ‘pdf’, ‘json’, ‘swf’, or ...) |
Returns: | the computed md5 of the combined samples |
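To show why the order of the md5 list matters when combining chunks, here is a small sketch that concatenates chunk bytes in the given order and hashes the result. The function name `combine_chunks_sketch` and the `chunks_by_md5` lookup dict are hypothetical stand-ins for fetching each chunk from the data store.

```python
import hashlib

def combine_chunks_sketch(chunks_by_md5, md5_list):
    """Concatenate chunks in md5_list order; return (md5, combined bytes)."""
    combined = b''.join(chunks_by_md5[m] for m in md5_list)
    return hashlib.md5(combined).hexdigest(), combined

# Two uploaded chunks, keyed by their own md5s.
part1, part2 = b'hel', b'lo'
md5_1 = hashlib.md5(part1).hexdigest()
md5_2 = hashlib.md5(part2).hexdigest()
chunks_by_md5 = {md5_1: part1, md5_2: part2}

combined_md5, combined = combine_chunks_sketch(chunks_by_md5, [md5_1, md5_2])
swapped_md5, swapped = combine_chunks_sketch(chunks_by_md5, [md5_2, md5_1])
```

Swapping the two md5s produces different bytes and therefore a different combined md5.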
Return a dataframe from the DataStore. This is just a convenience method that uses get_sample internally.
Parameters: | md5 – the md5 of the dataframe |
---|---|
 | compress – compression to use (defaults to ‘lz4’ but can be set to None) |
Returns: | A msgpack’d Pandas DataFrame |
Raises: | Workbench.DataNotFound – if the dataframe is not found. |
Add tags to this sample
Set the tags for this sample
Get tags for this sample
Get tags for this sample
Index a stored sample with the Indexer.
Parameters: | md5 – the md5 of the sample |
---|---|
 | index_name – the name of the index |
Returns: | Nothing |
Index worker output with the Indexer.
Parameters: | worker_name – the name of the worker, e.g. ‘strings’ or ‘pe_features’ |
---|---|
 | md5 – the md5 of the sample |
 | index_name – the name of the index |
 | subfield – index just this subfield (None for all) |
Returns: | Nothing |
Search a particular index in the Indexer.
Parameters: | index_name – the name of the index |
---|---|
 | query – the query against the index |
Returns: | All matches to the query |
Add a node to the graph with name and labels.
Parameters: | node_id – the unique node_id, e.g. ‘www.evil4u.com’ |
---|---|
 | name – the display name of the node, e.g. ‘evil4u’ |
 | labels – a list of labels, e.g. [‘domain’, ‘evil’] |
Returns: | Nothing |
Does the Graph DB have this node?
Parameters: | node_id – the unique node_id, e.g. ‘www.evil4u.com’ |
---|---|
Returns: | True/False |
Add a relationship. The source and target nodes must already exist (see add_node); ‘rel’ is the name of the relationship, e.g. ‘contains’.
Parameters: | source_id – the unique node_id of the source |
---|---|
 | target_id – the unique node_id of the target |
 | rel – name of the relationship |
Returns: | Nothing |
Clear the Graph Database of all nodes and edges.
Returns: | Nothing |
---|
Clear the Main Database of all samples and worker output.
Returns: | Nothing |
---|
Make a work request for an existing stored sample.
Parameters: | worker_name – the name of the worker, e.g. ‘strings’ or ‘pe_features’ |
---|---|
 | md5 – the md5 of the sample (or sample_set!) |
 | subkeys – just get a subkey of the output: ‘foo’ or ‘foo.bar’ (None for all) |
Returns: | The output of the worker. |
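The dotted subkey form (‘foo’ or ‘foo.bar’) suggests a walk through nested dictionaries. The sketch below shows one way such a lookup could work. `get_subkey_sketch` is a hypothetical helper, it handles a single subkey string only, and the sample worker output is made up for the example.

```python
def get_subkey_sketch(output, subkey=None):
    """Return output[subkey], following dots into nested dicts (None for all)."""
    if subkey is None:
        return output
    value = output
    for part in subkey.split('.'):
        value = value[part]  # raises KeyError if the path does not exist
    return value

# Made-up worker output for illustration.
worker_output = {'foo': {'bar': 42, 'baz': 'x'}, 'md5': 'a' * 32}
whole = get_subkey_sketch(worker_output)
foo = get_subkey_sketch(worker_output, 'foo')
foo_bar = get_subkey_sketch(worker_output, 'foo.bar')
```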
Store a sample set (which is just a list of md5s).
Note: All md5s must already be in the data store.
Parameters: | md5_list – a list of the md5s in this set (all must exist in data store) |
---|---|
Returns: | The md5 of the set (the actual md5 of the set) |
Generate a sample_set that matches the tags or all if tags are not specified.
Parameters: | tags – Match samples against this tag list (or all if not specified) |
---|---|
Returns: | The sample_set of those samples matching the tags |
Retrieve a sample set (which is just a list of md5s).
Parameters: | md5 – the md5 of the sample_set (returned with the ‘store_sample_set’ call) |
---|---|
Returns: | The list of md5s that comprise the sample_set |
Gives you the current datastore URI.
Returns: | The URI of the data store currently being used by Workbench |
---|
Workbench Server