What many tutorials don’t state is that, if you’re starting from scratch, data pre-processing can take up to 90% of your time on projects like these.

At the end of this hybrid article and tutorial, you should be able to:

- Pre-process the data provided by US-CERT into a format ready for an AI solution (TensorFlow in particular).
- Use RapidMiner Studio and TensorFlow 2.0 + Keras to create and train a model using a pre-processed sample CSV dataset.
- Perform basic analysis of your data and of the fields chosen for AI evaluation, and understand how practical the methods described are for your organization.

The author provides these methods, insights, and recommendations *as is* and makes no claim of warranty. Please do not use the models you create in this tutorial in a production environment without sufficient tuning and analysis before making them a part of your security program.

If you wish to follow along and perform these activities yourself, please download and install the tools used in this tutorial, including RapidMiner Studio and TensorFlow 2.0 with Keras, from their respective locations.
We will ultimately create models that can be re-used for additional predictions based on security events. Throughout the article, I will also point out the applicability and return on investment of each approach, depending on the maturity of your existing enterprise Information Security program. Note: to replicate the pre-processed data and the steps we use, expect to spend 1–2 hours on this page. Stay with me and try not to fall asleep during the data pre-processing portion.
The methods and solutions are designed for non-domain experts, particularly cybersecurity professionals. We will start our journey with the raw data provided in the dataset and walk through examples of different pre-processing methods to get it “ready” for the AI solution to ingest.
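To make "ready for the AI solution to ingest" concrete, here is a minimal sketch of two common pre-processing steps: one-hot encoding a categorical field and min-max scaling a numeric one. The rows and the field names (`user`, `activity`, `hour`) are hypothetical and chosen only for illustration; they are not the actual US-CERT schema, and this is not necessarily the method used later in the tutorial.

```python
# Hypothetical raw events. Field names and values are illustrative
# assumptions, not the real US-CERT dataset schema.
raw_events = [
    {"user": "u1", "activity": "Logon", "hour": 9},
    {"user": "u1", "activity": "Logoff", "hour": 17},
    {"user": "u2", "activity": "Connect", "hour": 23},
]


def one_hot(value, vocabulary):
    """Encode a categorical value as a 0/1 vector over a fixed vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def min_max(value, lowest, highest):
    """Scale a number into [0, 1]; return 0.0 if the range is empty."""
    if highest == lowest:
        return 0.0
    return (value - lowest) / (highest - lowest)


def preprocess(events):
    """Turn raw event dicts into purely numeric feature rows."""
    # A sorted vocabulary keeps the column order stable between runs.
    vocabulary = sorted({event["activity"] for event in events})
    hours = [event["hour"] for event in events]
    lowest, highest = min(hours), max(hours)

    rows = []
    for event in events:
        row = one_hot(event["activity"], vocabulary)
        row.append(min_max(event["hour"], lowest, highest))
        rows.append(row)
    return vocabulary, rows


vocabulary, features = preprocess(raw_events)
for row in features:
    print(row)
```

Each row is now a fixed-length list of floats, which is the general shape a neural network expects as input. A real pipeline would also need to handle missing values, timestamps, and much larger vocabularies.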
An A-Z tutorial on using US-CERT insider threat data to create and model neural networks in TensorFlow and RapidMiner Studio, written for cybersecurity professionals. This technical article will teach you how to pre-process data, create your own neural networks, and train and evaluate models using US-CERT’s simulated insider threat dataset.