workbench.server package

Submodules

workbench.server.data_store module

DataStore class for WorkBench.

class workbench.server.data_store.DataStore(uri='mongodb://localhost/workbench', database='workbench', worker_cap=0, samples_cap=0)[source]

Bases: object

DataStore for Workbench.

Currently tied to MongoDB but making this class ‘abstract’ should be straightforward and we could think about using another backend.

Initialization for the Workbench data store class.

Parameters:
  • uri – Connection String for DataStore backend.
  • database – Name of database.
  • worker_cap – Size of the capped worker collection, in MB.
  • samples_cap – Maximum size of the stored samples, in MB.
get_uri()[source]

Return the uri of the data store.

store_sample(filename, sample_bytes, type_tag)[source]

Store a sample into the datastore.

Parameters:
  • filename – Name of the file.
  • sample_bytes – Actual bytes of sample.
  • type_tag – Type of sample (‘exe’, ‘pcap’, ‘pdf’, ‘json’, ‘swf’, or ...).
Returns:

The md5 digest of the sample.
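As a rough illustration of the return value, the digest can be reproduced with Python's hashlib. This is a minimal sketch that assumes the digest is the hex md5 of the raw sample bytes; sample_md5 is a hypothetical helper and not part of DataStore.

```python
import hashlib


def sample_md5(sample_bytes):
    """Hypothetical helper: hex md5 digest of the raw sample bytes."""
    return hashlib.md5(sample_bytes).hexdigest()


# The same bytes always map to the same digest, so the digest can serve
# as a lookup key for the sample.
print(sample_md5(b"hello"))
```

Because the digest depends only on the bytes, storing the same content under two filenames would yield the same md5 in this sketch.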

sample_storage_size()[source]

Get the storage size of the samples storage collection.

expire_data()[source]

Expire data within the samples collection.

clean_for_serialization(data)[source]

Clean data in preparation for serialization.

Deletes items whose key starts with __ or whose value is a BSON, datetime, dict, or list instance.

Parameters: data – Sample data to be serialized.
Returns: Cleaned data dictionary.
clean_for_storage(data)[source]

Clean data in preparation for storage.

Deletes items whose key contains a ‘.’ or is ‘_id’. Also deletes items whose value is a dictionary or a list.

Parameters: data – Sample data dictionary to be cleaned.
Returns: Cleaned data dictionary.
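The cleaning rule described for clean_for_storage can be sketched as a plain function. This is an illustration of the documented rule only, not the DataStore implementation; clean_for_storage_sketch is a hypothetical name, and returning a new dictionary (rather than modifying the input) is an assumption.

```python
def clean_for_storage_sketch(data):
    """Return a copy of data without items that the rule above rejects."""
    cleaned = {}
    for key, value in data.items():
        # Drop keys that contain a '.' and the '_id' key.
        if '.' in key or key == '_id':
            continue
        # Drop values that are dictionaries or lists.
        if isinstance(value, (dict, list)):
            continue
        cleaned[key] = value
    return cleaned


record = {'_id': 1, 'file.name': 'a', 'md5': 'abc', 'tags': ['x'], 'length': 3}
print(clean_for_storage_sketch(record))
```

In this example only the ‘md5’ and ‘length’ items survive.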
get_sample(md5)[source]

Get the sample from the data store.

This method first fetches the data from the datastore, then cleans it for serialization, and finally adds the ‘raw_bytes’ item.

Parameters: md5 – The md5 digest of the sample to be fetched from the datastore.
Returns: The sample dictionary.
Raises: RuntimeError – If the sample is not found or the gridfs file is missing.
get_sample_window(type_tag, size=10)[source]

Get a window of samples not to exceed size (in MB).

Parameters:
  • type_tag – Type of sample (‘exe’, ‘pcap’, ‘pdf’, ‘json’, ‘swf’, or ...).
  • size – Size of samples in MBs.
Returns:

A list of md5s.
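One way to picture the size window is as a walk over the samples of one type that stops before the total size exceeds the limit. The sketch below works on an in-memory list and is not the DataStore query; the dictionary keys ‘length’ and ‘import_time’, the newest-first ordering, and stopping at the first sample that does not fit are all assumptions.

```python
def sample_window(samples, type_tag, size=10):
    """Return md5s of the newest samples of type_tag that fit in size MB.

    samples is a list of dicts with hypothetical keys 'md5', 'type_tag',
    'length' (in bytes) and 'import_time' (any sortable value).
    """
    budget = size * 1024 * 1024  # window size in bytes
    matching = [s for s in samples if s['type_tag'] == type_tag]
    matching.sort(key=lambda s: s['import_time'], reverse=True)  # newest first

    window, total = [], 0
    for sample in matching:
        if total + sample['length'] > budget:
            break  # adding this sample would exceed the window
        window.append(sample['md5'])
        total += sample['length']
    return window


MB = 1024 * 1024
samples = [
    {'md5': 'a', 'type_tag': 'exe', 'length': 6 * MB, 'import_time': 1},
    {'md5': 'b', 'type_tag': 'exe', 'length': 3 * MB, 'import_time': 2},
    {'md5': 'c', 'type_tag': 'pdf', 'length': 1 * MB, 'import_time': 3},
    {'md5': 'd', 'type_tag': 'exe', 'length': 4 * MB, 'import_time': 4},
]
print(sample_window(samples, 'exe', size=10))
```

Here the two newest ‘exe’ samples total 7 MB; the 6 MB sample would push the window past 10 MB and is left out.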

has_sample(md5)[source]

Checks if data store has this sample.

Parameters: md5 – The md5 digest of the required sample.
Returns: True if a sample with this md5 is present, else False.
list_samples(predicate={})[source]

List all samples that meet the predicate or all if predicate is not specified.

Parameters: predicate – Match samples against this predicate (or all if not specified).
Returns: List of dictionaries with matching samples {‘md5’: md5, ‘filename’: ‘foo.exe’, ‘type_tag’: ‘exe’}.
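To make the predicate behaviour concrete, the sketch below filters an in-memory list of sample dictionaries. It assumes the predicate is a dictionary of field/value pairs that must all match exactly; whether the real predicate supports richer query operators is not stated here, and list_samples_sketch is a hypothetical name.

```python
def list_samples_sketch(samples, predicate=None):
    """Return the samples whose fields equal every key/value in predicate."""
    predicate = predicate or {}  # an empty predicate matches everything
    return [
        sample for sample in samples
        if all(sample.get(key) == value for key, value in predicate.items())
    ]


samples = [
    {'md5': 'aaa', 'filename': 'foo.exe', 'type_tag': 'exe'},
    {'md5': 'bbb', 'filename': 'bar.pdf', 'type_tag': 'pdf'},
    {'md5': 'ccc', 'filename': 'baz.exe', 'type_tag': 'exe'},
]
print(list_samples_sketch(samples, {'type_tag': 'exe'}))
print(len(list_samples_sketch(samples)))
```

The sketch uses None as the default instead of {} so that calls never share one mutable default dictionary.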
store_work_results(results, collection, md5)[source]

Store the output results of the worker.

Parameters:
  • results – a dictionary.
  • collection – the database collection to store the results in.
  • md5 – the md5 of sample data to be updated.
get_work_results(collection, md5)[source]

Get the results of the worker.

Parameters:
  • collection – the database collection storing the results.
  • md5 – the md5 digest of the data.
Returns:

Dictionary of the worker result.

all_sample_md5s(type_tag=None)[source]

Return a list of all md5s matching the type_tag (‘exe’, ‘pdf’, etc.).

Parameters: type_tag – The type of sample.
Returns: A list of md5s of the matching samples.
clear_db()[source]

Drops the entire workbench database.

periodic_ops()[source]

Run periodic operations on the data store.

Operations like making sure collections are capped and indexes are set up.

to_unicode(s)[source]

Convert an elementary datatype to unicode.

Parameters: s – The value to be converted to unicode.
Returns: The value as unicode.
data_to_unicode(data)[source]

Recursively convert a list or dictionary to unicode.

Parameters: data – The list or dictionary to be converted to unicode.
Returns: The data as unicode.
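The recursive conversion can be pictured as follows. This is a Python 3 sketch in which "unicode" means str and byte strings are decoded as UTF-8; both the encoding and the helper names to_text and data_to_text are assumptions, not the DataStore methods themselves.

```python
def to_text(value):
    """Decode a byte string to str; leave every other value unchanged."""
    if isinstance(value, bytes):
        return value.decode('utf-8', errors='replace')
    return value


def data_to_text(data):
    """Recursively convert the keys and values of dicts and lists to str."""
    if isinstance(data, dict):
        return {to_text(key): data_to_text(value) for key, value in data.items()}
    if isinstance(data, list):
        return [data_to_text(item) for item in data]
    return to_text(data)


print(data_to_text({b'name': [b'foo.exe', 42, {'tag': b'exe'}]}))
```

Values that are not byte strings, such as the integer 42, pass through untouched.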

workbench.server.els_indexer module

ELSIndexer class for WorkBench.

class workbench.server.els_indexer.ELSStubIndexer(hosts='[{"host": "localhost", "port": 9200}]')[source]

Bases: object

ELS Stub.

Stub Indexer Initialization.

index_data(data, index_name, doc_type)[source]

Index data in Stub Indexer.

search(index_name, query)[source]

Search in Stub Indexer.

class workbench.server.els_indexer.ELSIndexer(hosts=None)[source]

Bases: object

ELSIndexer class for WorkBench.

Initialization for the Elastic Search Indexer.

Parameters: hosts – List of connection settings.
index_data(data, index_name, doc_type)[source]

Take an arbitrary dictionary of data and index it with ELS.

Parameters:
  • data – data to be Indexed. Should be a dictionary.
  • index_name – Name of the index.
  • doc_type – The type of the document.
Raises:

RuntimeError – When the Indexing fails.

search(index_name, query)[source]

Search the given index_name with the given ELS query.

Parameters:
  • index_name – Name of the Index
  • query – The string to be searched.
Returns:

List of results.

Raises:

RuntimeError – When the search query fails.

workbench.server.neo_db module

NeoDB class for WorkBench.

class workbench.server.neo_db.NeoDBStub(uri='http://localhost:7474/db/data')[source]

Bases: object

NeoDB Stub.

NeoDB Stub.

add_node(node_id, name, labels)[source]

NeoDB Stub.

has_node(node_id)[source]

NeoDB Stub.

add_rel(source_node_id, target_node_id, rel)[source]

NeoDB Stub.

clear_db()[source]

NeoDB Stub.

class workbench.server.neo_db.NeoDB(uri='http://localhost:7474/db/data')[source]

Bases: object

NeoDB indexer for Workbench.

Initialization for NeoDB indexer.

Parameters: uri – The URI used to connect to NeoDB.
Raises: RuntimeError – If the connection to NeoDB fails.
add_node(node_id, name, labels)[source]

Add the node with name and labels.

Parameters:
  • node_id – Id for the node.
  • name – Name for the node.
  • labels – Labels for the node.
Raises:

NotImplementedError – When adding labels is not supported.

has_node(node_id)[source]

Checks if the node is present.

Parameters: node_id – Id for the node.
Returns: True if a node with node_id is present, else False.
add_rel(source_node_id, target_node_id, rel)[source]

Add a relationship between nodes.

Parameters:
  • source_node_id – Node Id for the source node.
  • target_node_id – Node Id for the target node.
  • rel – Name of the relationship, e.g. ‘contains’.
clear_db()[source]

Clear the Graph Database of all nodes and edges.
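The four graph operations documented for NeoDB can be mirrored by a small in-memory class, which may help when reading code that calls them. This is a sketch only: InMemoryGraph is a hypothetical stand-in, it does not talk to Neo4j, and raising KeyError for a missing endpoint is an assumption about error handling.

```python
class InMemoryGraph:
    """Hypothetical stand-in exposing the same four methods as NeoDB."""

    def __init__(self):
        self.nodes = {}    # node_id -> {'name': ..., 'labels': [...]}
        self.rels = set()  # (source_node_id, rel, target_node_id) triples

    def add_node(self, node_id, name, labels):
        self.nodes[node_id] = {'name': name, 'labels': list(labels)}

    def has_node(self, node_id):
        return node_id in self.nodes

    def add_rel(self, source_node_id, target_node_id, rel):
        # Assumption: both endpoints must already exist.
        if not (self.has_node(source_node_id) and self.has_node(target_node_id)):
            raise KeyError('both nodes must exist before adding a relationship')
        self.rels.add((source_node_id, rel, target_node_id))

    def clear_db(self):
        self.nodes.clear()
        self.rels.clear()


graph = InMemoryGraph()
graph.add_node('sample_1', 'sample', ['exe'])
graph.add_node('string_1', 'string', ['ascii'])
graph.add_rel('sample_1', 'string_1', 'contains')
print(graph.has_node('sample_1'), len(graph.rels))
```

A stand-in like this can also be convenient in tests that should not depend on a running graph database.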

workbench.server.plugin_manager module

A simple plugin manager. Rolling my own for three reasons: 1) Environmental scan did not give me quite what I wanted. 2) The super simple examples didn’t support automatic/dynamic loading. 3) I kinda wanted to understand the process :)

workbench.server.plugin_manager.test()[source]

Executes plugin_manager.py test.

workbench.server.workbench module

Workbench: Open Source Security Framework

class workbench.server.workbench.WorkBench(store_args=None, els_hosts=None, neo_uri=None)[source]

Bases: object

Workbench: Open Source Security Framework.

Initialize the Framework.

Parameters:
  • store_args – Dictionary with keys uri, database, samples_cap, worker_cap.
  • els_hosts – The address where Elastic Search Indexer is running.
  • neo_uri – The address where Neo4j is running.
store_sample(filename, input_bytes, type_tag)[source]

Store a sample into the DataStore.

Parameters:
  • filename – Name of the file (used purely as metadata, not for lookup).
  • input_bytes – The actual bytes of the sample, e.g. f.read().
  • type_tag – Type of sample (‘exe’, ‘pcap’, ‘pdf’, ‘json’, ‘swf’, or ...).
Returns: The md5 of the sample.
get_sample(md5)[source]

Get a sample from the DataStore.

Parameters: md5 – The md5 of the sample.
Returns: A dictionary of metadata about the sample, which includes a [‘raw_bytes’] key that contains the raw bytes.
get_sample_window(type_tag, size)[source]

Get a window of samples from the DataStore.

Parameters:
  • type_tag – The type of samples (‘pcap’, ‘exe’, ‘pdf’).
  • size – The size of the window in megabytes (10 = 10 MB).
Returns: A list of md5s representing the newest samples within the size window.
has_sample(md5)[source]

Check whether this sample is in the DataStore.

Parameters: md5 – The md5 of the sample.
Returns: True or False.
list_samples(predicate={})[source]

List all samples that meet the predicate or all if predicate is not specified.

Parameters: predicate – Match samples against this predicate (or all if not specified).
Returns: List of dictionaries with matching samples {‘md5’: md5, ‘filename’: ‘foo.exe’, ‘type_tag’: ‘exe’}.
stream_sample[source]
guess_type_tag(input_bytes)[source]

Try to guess the type_tag for this sample

index_sample(md5, index_name)[source]

Index a stored sample with the Indexer.

Parameters:
  • md5 – The md5 of the sample.
  • index_name – The name of the index.
Returns: Nothing.
index_worker_output(worker_name, md5, index_name, subfield)[source]

Index worker output with the Indexer.

Parameters:
  • worker_name – Name of the worker, e.g. ‘strings’ or ‘pe_features’.
  • md5 – The md5 of the sample.
  • index_name – The name of the index.
  • subfield – Index just this subfield (None for all).
Returns: Nothing.
search(index_name, query)[source]

Search a particular index in the Indexer.

Parameters:
  • index_name – The name of the index.
  • query – The query against the index.
Returns: All matches to the query.
add_node(node_id, name, labels)[source]

Add a node to the graph with name and labels.

Parameters:
  • node_id – The unique node_id, e.g. ‘www.evil4u.com’.
  • name – The display name of the node, e.g. ‘evil4u’.
  • labels – A list of labels, e.g. [‘domain’, ‘evil’].
Returns: Nothing.
has_node(node_id)[source]

Check whether the Graph DB has this node.

Parameters: node_id – The unique node_id, e.g. ‘www.evil4u.com’.
Returns: True or False.
add_rel(source_id, target_id, rel)[source]

Add a relationship between two nodes. The source and target must already exist (see add_node).

Parameters:
  • source_id – The unique node_id of the source.
  • target_id – The unique node_id of the target.
  • rel – Name of the relationship, e.g. ‘contains’.
Returns: Nothing.
clear_graph_db()[source]

Clear the Graph Database of all nodes and edges.

Returns: Nothing.
clear_db()[source]

Clear the Main Database of all samples and worker output.

Returns: Nothing.
work_request(worker_name, md5, subkeys=None)[source]

Make a work request for an existing stored sample.

Parameters:
  • worker_name – Name of the worker, e.g. ‘strings’ or ‘pe_features’.
  • md5 – The md5 of the sample.
  • subkeys – Return just a subfield, e.g. ‘foo’ or ‘foo.bar’ (None for all).
Returns: The output of the worker, or just the requested subfield of the worker output.
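The dotted subkeys form (‘foo.bar’) suggests a walk through nested dictionaries. The sketch below shows one plausible reading on a plain dictionary; select_subkeys is a hypothetical helper, and how Workbench itself resolves subkeys (for example, whether several subkeys can be requested at once) is not covered here.

```python
def select_subkeys(output, subkeys=None):
    """Return output itself, or the nested value addressed by 'a.b.c'."""
    if subkeys is None:
        return output  # None means the whole worker output
    value = output
    for part in subkeys.split('.'):
        value = value[part]  # raises KeyError if a level is missing
    return value


worker_output = {'foo': {'bar': 42, 'baz': 'x'}, 'other': [1, 2]}
print(select_subkeys(worker_output, 'foo.bar'))
print(select_subkeys(worker_output, 'foo'))
```

Passing None returns the full dictionary, matching the "None for all" wording in the parameter description.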
batch_work_request[source]
store_sample_set(md5_list)[source]

Store a sample set (which is just a list of md5s).

Note: All md5s must already be in the data store.

Parameters: md5_list – A list of the md5s in this set (all must exist in the data store).
Returns: The md5 of the set (the actual md5 of the set).
get_sample_set(md5)[source]

Get a sample set (which is just a list of md5s).

Parameters: md5 – The md5 of the sample set.
Returns: The list of md5s in the set.
stream_sample_set[source]
get_datastore_uri()[source]

Get the current datastore URI.

Returns: The URI of the data store currently being used by Workbench.
help(cli=False)[source]

Returns help commands

help_basic(cli=False)[source]

Returns basic help commands

help_commands(cli=False)[source]

Returns a big string of Workbench commands and signatures

help_command(command, cli=False)[source]

Returns a specific Workbench command and docstring

help_workers(cli=False)[source]

Returns a big string of the loaded Workbench workers and their dependencies

help_worker(worker, cli=False)[source]

Returns a specific Workbench worker and docstring

help_advanced(cli=False)[source]

Returns advanced help commands

help_everything(cli=False)[source]

Returns advanced help commands

list_all_commands()[source]

Returns a list of all the Workbench commands

list_all_workers()[source]

List all the currently loaded workers

worker_info(worker_name)[source]

Get the information about this worker

test_worker(worker_name)[source]

Run the test for a specific worker

workbench.server.workbench.run()[source]

Run the workbench server

workbench.server.workbench.test()[source]

Module contents