workbench.server package


workbench.server.data_store module

DataStore class for WorkBench.

class workbench.server.data_store.DataStore(uri='mongodb://localhost/workbench', database='workbench', worker_cap=0, samples_cap=0)[source]

Bases: object

DataStore for Workbench.

Currently tied to MongoDB but making this class ‘abstract’ should be straightforward and we could think about using another backend.

Initialization for the Workbench data store class.

  • uri – Connection String for DataStore backend.
  • database – Name of database.
  • worker_cap – MBs in the capped collection.
  • samples_cap – MBs of sample to be stored.

Return the uri of the data store.

store_sample(sample_bytes, filename, type_tag)[source]

Store a sample into the datastore.

  • filename – Name of the file.
  • sample_bytes – Actual bytes of sample.
  • type_tag – Type of sample (‘exe’, ‘pcap’, ‘pdf’, ‘json’, ‘swf’, or ...)

Returns:md5 digest of the sample.
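
As a rough illustration of the return value, the sketch below computes an md5 hex digest over raw sample bytes with the standard library. The helper name md5_of_sample is hypothetical and is not part of DataStore; how DataStore computes the digest internally is not shown in this reference.

```python
import hashlib


def md5_of_sample(sample_bytes):
    """Return the hex md5 digest of the raw sample bytes (hypothetical helper)."""
    return hashlib.md5(sample_bytes).hexdigest()


digest = md5_of_sample(b"hello")
print(digest)
```

The digest is a 32-character hex string, which is the form the other methods in this module accept as md5.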


Get the storage size of the samples storage collection.


Expire data within the samples collection.


Delete a specific sample


Clean data in preparation for serialization.

Deletes items whose value is a BSON, datetime, dict, or list instance, and items whose key starts with __.

Parameters:data – Sample data to be serialized.
Returns:Cleaned data dictionary.

Clean data in preparation for storage.

Deletes items whose key contains a ‘.’ or is ‘_id’. Also deletes items whose value is a dictionary or a list.

Parameters:data – Sample data dictionary to be cleaned.
Returns:Cleaned data dictionary.
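
The storage cleaning rules can be sketched as a single filter. This is a minimal illustration written from the description only, not the DataStore implementation; the function name clean_for_storage_sketch is hypothetical, and returning a new dictionary rather than mutating the input is an assumption.

```python
def clean_for_storage_sketch(data):
    """Drop keys containing '.', the '_id' key, and dict or list values.

    Hypothetical helper: returns a new dict and leaves the input untouched.
    """
    return {
        key: value
        for key, value in data.items()
        if "." not in key
        and key != "_id"
        and not isinstance(value, (dict, list))
    }


raw = {"_id": 1, "a.b": 2, "name": "x", "nested": {"k": 1}, "items": [1], "size": 3}
cleaned = clean_for_storage_sketch(raw)
print(cleaned)
```
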
get_full_md5(partial_md5, collection)[source]

Supports partial/short md5s: given a partial md5, returns the full md5.
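
One way to picture short md5 resolution is prefix matching against the known digests. The sketch below is an assumption about the general idea, not the DataStore code: expand_md5 is a hypothetical name, it searches a plain list instead of a database collection, and raising ValueError on zero or multiple matches is an illustrative choice.

```python
def expand_md5(partial_md5, known_md5s):
    """Resolve a short md5 prefix to the single full md5 it identifies."""
    if len(partial_md5) == 32:
        return partial_md5  # already a full digest
    matches = [md5 for md5 in known_md5s if md5.startswith(partial_md5)]
    if len(matches) != 1:
        # Assumption: ambiguous or unknown prefixes are treated as errors.
        raise ValueError("%d samples match %r" % (len(matches), partial_md5))
    return matches[0]


known = [
    "5d41402abc4b2a76b9719d911017c592",
    "5d7b9adcbe1c629ec722529dd12e5129",
]
full = expand_md5("5d41", known)
print(full)
```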


Get the sample from the data store.

This method first fetches the data from datastore, then cleans it for serialization and then updates it with ‘raw_bytes’ item.

Parameters:md5 – The md5 digest of the sample to be fetched from datastore.
Returns:The sample dictionary or None
get_sample_window(type_tag, size=10)[source]

Get a window of samples not to exceed size (in MB).

  • type_tag – Type of sample (‘exe’, ‘pcap’, ‘pdf’, ‘json’, ‘swf’, or ...).
  • size – Size of samples in MBs.

Returns:a list of md5s.
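
The size window can be thought of as a byte budget filled with the newest samples first. The following is a minimal in-memory sketch under stated assumptions: the sample records (dictionaries with ‘md5’ and ‘length’ keys, ordered oldest to newest) and the name sample_window_sketch are hypothetical, and the real method queries the data store instead.

```python
def sample_window_sketch(samples, size=10):
    """Return md5s of the newest samples whose total size stays within size MB."""
    budget = size * 1024 * 1024
    total = 0
    window = []
    for sample in reversed(samples):  # walk from newest to oldest
        if total + sample["length"] > budget:
            break
        total += sample["length"]
        window.append(sample["md5"])
    return window


four_mb = 4 * 1024 * 1024
samples = [
    {"md5": "a", "length": four_mb},  # oldest
    {"md5": "b", "length": four_mb},
    {"md5": "c", "length": four_mb},  # newest
]
window = sample_window_sketch(samples, size=10)
print(window)
```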


Checks if data store has this sample.

Parameters:md5 – The md5 digest of the required sample.
Returns:True if sample with this md5 is present, else False.

List all samples that match the tags or all if tags are not specified.

Parameters:tags – Match samples against these tags (or all if not specified)
Returns:List of the md5s for the matching samples

List of the tags and md5s for all samples

Returns:List of the tags and md5s for all samples
store_work_results(results, collection, md5)[source]

Store the output results of the worker.

  • results – a dictionary.
  • collection – the database collection to store the results in.
  • md5 – the md5 of sample data to be updated.
get_work_results(collection, md5)[source]

Get the results of the worker.

  • collection – the database collection storing the results.
  • md5 – the md5 digest of the data.

Returns:Dictionary of the worker result.


Return a list of all md5s matching the type_tag (‘exe’, ‘pdf’, etc).

Parameters:type_tag – the type of sample.
Returns:a list of matching samples.

Drops all of the worker output collections


Drops the entire workbench database.


Run periodic operations on the data store.

Operations like making sure collections are capped and indexes are set up.


Convert an elementary datatype to unicode.

Parameters:s – the datatype to be unicoded.
Returns:Unicoded data.

Recursively convert a list or dictionary to unicode.

Parameters:data – The data to be unicoded.
Returns:Unicoded data.
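
A recursive conversion of this kind can be sketched as follows. This is an illustration in Python 3 terms, where "unicode" means str; the name to_unicode_sketch, the UTF-8 decoding, and the choice to leave non-bytes scalars unchanged are assumptions, not the DataStore behavior.

```python
def to_unicode_sketch(data):
    """Recursively decode bytes found in dicts and lists into str."""
    if isinstance(data, bytes):
        # Assumption: UTF-8 with replacement for undecodable bytes.
        return data.decode("utf-8", errors="replace")
    if isinstance(data, dict):
        return {to_unicode_sketch(k): to_unicode_sketch(v) for k, v in data.items()}
    if isinstance(data, list):
        return [to_unicode_sketch(item) for item in data]
    return data  # numbers, None, and str pass through unchanged


converted = to_unicode_sketch({b"name": [b"evil.exe", 42], "tags": {b"type": b"exe"}})
print(converted)
```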

workbench.server.dir_watcher module

A simple directory watcher. Credit: ronedg @

class workbench.server.dir_watcher.DirWatcher(path)[source]

Bases: object

A simple directory watcher

Initialize the Directory Watcher.

Parameters:path – path of the directory to watch

register_callbacks(on_create, on_modify, on_delete)[source]

Register callbacks for file creation, modification, and deletion
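
The three callbacks map naturally onto a comparison of directory snapshots. The sketch below illustrates that idea only; it is not how DirWatcher is implemented. The helpers snapshot and dispatch_changes are hypothetical, and detecting changes by polling modification times is an assumption.

```python
import os


def snapshot(path):
    """Map each file name in path to its last-modification time."""
    return {
        name: os.path.getmtime(os.path.join(path, name))
        for name in os.listdir(path)
        if os.path.isfile(os.path.join(path, name))
    }


def dispatch_changes(old, new, on_create, on_modify, on_delete):
    """Compare two snapshots and fire the matching callback for each file."""
    for name in sorted(new):
        if name not in old:
            on_create(name)
        elif new[name] != old[name]:
            on_modify(name)
    for name in sorted(old):
        if name not in new:
            on_delete(name)


events = []
dispatch_changes(
    {"a.py": 1.0, "b.py": 1.0},
    {"a.py": 2.0, "c.py": 1.0},
    on_create=lambda name: events.append(("create", name)),
    on_modify=lambda name: events.append(("modify", name)),
    on_delete=lambda name: events.append(("delete", name)),
)
print(events)
```

A polling loop would call snapshot on an interval and pass the previous and current results to dispatch_changes.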


Monitor the path given


Cleanup the DirWatcher instance

workbench.server.els_indexer module

ELSIndexer class for WorkBench.

class workbench.server.els_indexer.ELSStubIndexer(hosts='[{"host": "localhost", "port": 9200}]')[source]

Bases: object

ELS Stub.

Stub Indexer Initialization.

index_data(data, index_name, doc_type)[source]

Index data in Stub Indexer.

search(index_name, query)[source]

Search in Stub Indexer.

class workbench.server.els_indexer.ELSIndexer(hosts=None)[source]

Bases: object

ELSIndexer class for WorkBench.

Initialization for the Elastic Search Indexer.

Parameters:hosts – List of connection settings.
index_data(data, index_name, doc_type)[source]

Take an arbitrary dictionary of data and index it with ELS.

  • data – data to be Indexed. Should be a dictionary.
  • index_name – Name of the index.
  • doc_type – The type of the document.

Raises:RuntimeError – When the Indexing fails.

search(index_name, query)[source]

Search the given index_name with the given ELS query.

  • index_name – Name of the Index
  • query – The string to be searched.

Returns:List of results.
Raises:RuntimeError – When the search query fails.

workbench.server.neo_db module

NeoDB class for WorkBench.

class workbench.server.neo_db.NeoDBStub(uri='http://localhost:7474/db/data')[source]

Bases: object

NeoDB Stub.

NeoDB Stub.

add_node(node_id, name, labels)[source]

NeoDB Stub.


NeoDB Stub.

add_rel(source_node_id, target_node_id, rel)[source]

NeoDB Stub.


NeoDB Stub.

class workbench.server.neo_db.NeoDB(uri='http://localhost:7474/db/data')[source]

Bases: object

NeoDB indexer for Workbench.

Initialization for NeoDB indexer.

Parameters:uri – The uri to connect NeoDB.
Raises:RuntimeError – When connection to NeoDB failed.
add_node(node_id, name, labels)[source]

Add the node with name and labels.

  • node_id – Id for the node.
  • name – Name for the node.
  • labels – Label for the node.

Raises:NotImplementedError – When adding labels is not supported.


Checks if the node is present.

Parameters:node_id – Id for the node.
Returns:True if node with node_id is present, else False.
add_rel(source_node_id, target_node_id, rel)[source]

Add a relationship between nodes.

  • source_node_id – Node Id for the source node.
  • target_node_id – Node Id for the target node.
  • rel – Name of the relationship, e.g. ‘contains’.

Clear the Graph Database of all nodes and edges.

workbench.server.plugin_manager module

A simple plugin manager. Rolling my own for three reasons: 1) Environmental scan did not give me quite what I wanted. 2) The super simple examples didn’t support automatic/dynamic loading. 3) I kinda wanted to understand the process :)

class workbench.server.plugin_manager.PluginManager(plugin_callback, plugin_dir='workers')[source]

Bases: object

Plugin Manager for Workbench.

Initialize the Plugin Manager for Workbench.

  • plugin_callback – The callback for plugin. This is called when plugin is added.
  • plugin_dir – The dir where plugin resides.

Load all the plugins in the plugin directory


Watcher callback

Parameters:event – The creation event.

Watcher callback.

Parameters:event – The modification event.

Watcher callback.

Parameters:event – The deletion event.

Removing a deleted plugin.

Parameters:f – the filepath for the plugin.

Adding and verifying plugin.

Parameters:f – the filepath for the plugin.
Validate the plugin, each plugin must have the following:
  1. The worker class must have an execute method: execute(self, input_data).
  2. The worker class must have a dependencies list (even if it’s empty).
  3. The file must have a top level test() method.
Parameters:handler – the loaded plugin.
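
The three requirements can be checked with a few attribute tests. The following is a minimal sketch of such a check, not the PluginManager code: validate_plugin_sketch, the Strings worker, and the shape of input_data are all hypothetical, and the modules are built in memory purely so the example is self-contained.

```python
import inspect
import types


def validate_plugin_sketch(module):
    """Return True if module has a top-level test() and a valid worker class."""
    if not callable(getattr(module, "test", None)):
        return False
    for _, cls in inspect.getmembers(module, inspect.isclass):
        has_execute = callable(getattr(cls, "execute", None))
        has_deps = isinstance(getattr(cls, "dependencies", None), list)
        if has_execute and has_deps:
            return True
    return False


class Strings(object):
    """Hypothetical worker used only for this illustration."""

    dependencies = ["sample"]  # required, even if empty

    def execute(self, input_data):
        # Assumption: input_data is keyed by dependency name.
        return {"length": len(input_data["sample"]["raw_bytes"])}


def plugin_test():
    """Stand-in for the top-level test() each plugin file must define."""
    return Strings().execute({"sample": {"raw_bytes": b"abc"}})


good = types.ModuleType("strings_worker")
good.Strings = Strings
good.test = plugin_test

bad = types.ModuleType("broken_worker")
bad.Strings = Strings  # no top-level test(), so validation fails

print(validate_plugin_sketch(good), validate_plugin_sketch(bad))
```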

Plugin validation.

Every workbench plugin must have top level test method.

Parameters:handler – The loaded plugin.
Returns:The test function, or None if the plugin has no top level test method.

Plugin validation

Every workbench plugin must have a dependencies list (even if it’s empty). Every workbench plugin must have an execute method.

Parameters:plugin_class – The loaded plugin class.
Returns:True if dependencies and execute are present, else False.

Executes test.

workbench.server.version module

Workbench Server Version

workbench.server.workbench_server module

Workbench: Open Source Security Framework

class workbench.server.workbench_server.WorkBench(store_args=None, els_hosts=None, neo_uri=None)[source]

Bases: object

Workbench: Open Source Security Framework.

Initialize the Framework.

  • store_args – Dictionary with keys uri, database, samples_cap, worker_cap.
  • els_hosts – The address where Elastic Search Indexer is running.
  • neo_uri – The address where Neo4j is running.
exception DataNotFound[source]

Bases: exceptions.Exception

static message()[source]

Return the version of the Workbench server

WorkBench.store_sample(input_bytes, filename, type_tag)[source]

Store a sample into the DataStore.

  • input_bytes – the actual bytes of the sample e.g.
  • filename – name of the file (used purely as meta data not for lookup)
  • type_tag – (‘exe’, ‘pcap’, ‘pdf’, ‘json’, ‘swf’, or ...)

Returns:the md5 of the sample.

Get a sample from the DataStore.

Parameters:md5 – the md5 of the sample

Returns:A dictionary of meta data about the sample which includes a [‘raw_bytes’] key that contains the raw bytes.
Raises:Workbench.DataNotFound if the sample is not found.

Does the md5 represent a sample_set?

Parameters:md5 – the md5 of the sample_set

WorkBench.get_sample_window(type_tag, size)[source]

Get a window of samples from the DataStore.

  • type_tag – the type of samples (‘pcap’, ‘exe’, ‘pdf’)
  • size – the size of the window in MegaBytes (10 = 10MB)

Returns:A sample_set handle which represents the newest samples within the size window

Do we have this sample in the DataStore?

Parameters:md5 – the md5 of the sample

Returns:True or False
WorkBench.combine_samples(md5_list, filename, type_tag)[source]

Combine samples together. This has various use cases, the most significant being when a bunch of sample ‘chunks’ have been uploaded and now need to be combined.

  • md5_list – The list of md5s to combine, order matters!
  • filename – name of the file (used purely as meta data not for lookup)
  • type_tag – (‘exe’, ‘pcap’, ‘pdf’, ‘json’, ‘swf’, or ...)

Returns:the computed md5 of the combined samples
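
Why order matters can be seen in a small sketch that joins chunks and hashes the result. This is an illustration with in-memory byte strings only; combine_chunks is a hypothetical helper, whereas combine_samples itself takes md5s of already stored samples.

```python
import hashlib


def combine_chunks(chunks):
    """Concatenate byte chunks in the given order and return (bytes, md5)."""
    combined = b"".join(chunks)
    return combined, hashlib.md5(combined).hexdigest()


combined, combined_md5 = combine_chunks([b"hel", b"lo"])
_, reversed_md5 = combine_chunks([b"lo", b"hel"])
print(combined_md5, reversed_md5)
```

Reversing the chunk order yields different bytes and therefore a different md5.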

Remove the sample from the data store

WorkBench.stream_sample[source]
WorkBench.get_dataframe(md5, compress='lz4')[source]

Return a dataframe from the DataStore. This is just a convenience method that uses get_sample internally.

  • md5 – the md5 of the dataframe
  • compress – compression to use (defaults to ‘lz4’ but can be set to None)

Returns:A msgpack’d Pandas DataFrame
Raises:Workbench.DataNotFound if the dataframe is not found.

Try to guess the type_tag for this sample

WorkBench.add_tags(md5, tags)[source]

Add tags to this sample

WorkBench.set_tags(md5, tags)[source]

Set the tags for this sample


Get tags for this sample


Get tags for this sample

WorkBench.index_sample(md5, index_name)[source]

Index a stored sample with the Indexer.

  • md5 – the md5 of the sample
  • index_name – the name of the index

WorkBench.index_worker_output(worker_name, md5, index_name, subfield)[source]

Index worker output with the Indexer.

  • worker_name – ‘strings’, ‘pe_features’, whatever
  • md5 – the md5 of the sample
  • index_name – the name of the index
  • subfield – index just this subfield (None for all)

WorkBench.search_index(index_name, query)[source]

Search a particular index in the Indexer.

  • index_name – the name of the index
  • query – the query against the index

Returns:All matches to the query
WorkBench.add_node(node_id, name, labels)[source]

Add a node to the graph with name and labels.

  • node_id – the unique node_id e.g. ‘’
  • name – the display name of the node e.g. ‘evil4u’
  • labels – a list of labels e.g. [‘domain’, ‘evil’]


Does the Graph DB have this node?

Parameters:node_id – the unique node_id e.g. ‘’

WorkBench.add_rel(source_id, target_id, rel)[source]

Add a relationship. The source and target must already exist (see add_node); ‘rel’ is the name of the relationship, ‘contains’ or whatever.

  • source_id – the unique node_id of the source
  • target_id – the unique node_id of the target
  • rel – name of the relationship


Clear the Graph Database of all nodes and edges.


Clear the Main Database of all samples and worker output.


Drops all of the worker output collections

WorkBench.work_request(worker_name, md5, subkeys=None)[source]

Make a work request for an existing stored sample.

  • worker_name – ‘strings’, ‘pe_features’, whatever
  • md5 – the md5 of the sample (or sample_set!)
  • subkeys – just get a subkey of the output: ‘foo’ or ‘’ (None for all)

Returns:The output of the worker.
WorkBench.set_work_request[source]

Store a sample set (which is just a list of md5s).

Note: All md5s must already be in the data store.

Parameters:md5_list – a list of the md5s in this set (all must exist in data store)
Returns:The md5 of the set (the actual md5 of the set)
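
Since a sample set is just a list of md5s, its own md5 can be derived from that list. How Workbench derives it is not stated in this reference, so the sketch below is only one plausible scheme: sample_set_md5 is a hypothetical helper, and hashing the comma-joined list is an assumption.

```python
import hashlib


def sample_set_md5(md5_list):
    """Derive a single md5 identifying a list of member md5s (assumed scheme)."""
    joined = ",".join(md5_list).encode("utf-8")
    return hashlib.md5(joined).hexdigest()


members = ["a" * 32, "b" * 32]
set_md5 = sample_set_md5(members)
print(set_md5)
```

Under this scheme the same list always maps to the same identifier, so storing an identical set twice yields the same md5.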

Generate a sample_set that matches the tags or all if tags are not specified.

Parameters:tags – Match samples against this tag list (or all if not specified)
Returns:The sample_set of those samples matching the tags

Retrieve a sample set (which is just a list of md5s).

Parameters:md5 – the md5 of the sample_set (returned with the ‘store_sample_set’ call)
Returns:The list of md5s that comprise the sample_set
WorkBench.stream_sample_set[source]

Gives you the current datastore URI.

Returns:The URI of the data store currently being used by Workbench

Returns the formatted, colored help


Returns a list of all the Workbench commands


List all the currently loaded workers


Get the information about this component

WorkBench.store_info(info_dict, component, type_tag)[source]

Store information about a component. The component could be a worker, a command, a class, or whatever you want; the only thing to be aware of is name collisions.


Run the test for a specific worker

Run the workbench server


Module contents

Workbench Server