WEB DATA COLLECTION
SPIDA is a set of tools that collect unstructured data from the clear, deep and dark web. SPIDA comes in three configurations that enable investigators to acquire and collate the material in the most appropriate form.
Huntsman is a powerful tool that helps gather unstructured data from webpages and documents for insertion into i2 Analyst’s Notebook charts. The extracted data maintains a link to the original source throughout the process. Point Duty Huntsman is available as a plug-in for IBM i2 Analyst’s Notebook.
Huntsman saves time – it extracts data straight from a source and imports it directly into an i2 Analyst’s Notebook chart as entities, links, attributes or cards.
Huntsman is efficient – it can extract images, text, website data and scripting, or capture a screenshot.
Huntsman is discreet – built-in TOR allows private data extraction from the clear, deep or dark web.
Huntsman can be used to manually extract text and images. Extracted data can be imported directly into i2 Analyst’s Notebook as entities, links, attributes or cards.
Huntsman is used to extract data from web pages, forums, bulletin boards and social networks, and it supports many source types: source data can come from PDF, Word, HTML or TXT files. Huntsman can extract HTML data using the inbuilt TOR Browser, allowing discreet extraction from the clear, deep or dark web. Huntsman captures data from the entire website. Collected data is logged and maintained in i2 cards as a screenshot and as text, images and scripting. All items are linked to their original sources for archival and evidence purposes. An audit trail is created, with logs of all extractions performed by a user.
Wolf provides fully automated full-site captures based on keywords and URLs. Wolf’s unique feature is a heuristic learning engine that enables it to learn the layout of various forms of website, such as bulletin boards, with their wide variety of layouts and conventions for data presentation. Wolf can learn date formats, name conventions, post configurations and reply formats.
Funnelweb delivers complete website downloads based on a keyword or URL. Multiple searches and downloads can be run simultaneously. All data is stored and can be exported to IBM i2 Analyst's Notebook charts for analysis. Anonymous searching can be carried out using the built-in TOR browser.