dc.contributor.author | Perona Balda, Iñigo | |
dc.contributor.author | Arbelaiz Gallego, Olatz | |
dc.contributor.author | Gurrutxaga Goikoetxea, Ibai | |
dc.contributor.author | Martín Aramburu, Jose Ignacio | |
dc.contributor.author | Muguerza Rivero, Javier Francisco | |
dc.contributor.author | Pérez de la Fuente, Jesús María | |
dc.date.accessioned | 2017-02-10T10:38:24Z | |
dc.date.available | 2017-02-10T10:38:24Z | |
dc.date.issued | 2017-02-10 | |
dc.identifier.uri | http://hdl.handle.net/10810/20608 | |
dc.description | The GureKDDCup database was built within the UADI (Unsupervised Anomaly Detection for Intrusion detection system) project. The main goal of this project is to develop a classifier that detects intrusions (attacks) in a system, using unsupervised classification techniques to develop it. The project's most distinctive feature is that it uses the payload (the data field of the packets) to detect attacks in connections. Using the payload to classify connections is an area that has not yet been studied in depth, but it appears to be essential for detecting R2L attacks (Remote to Local, which aim to access a resource without having permission to use it) and U2R attacks (User to Root, in which an ordinary user aims to obtain superuser or administrator privileges).
In the classification process we will have to work with a huge number of connections, which inevitably leads us into the world of Data Mining. We want the process of learning the classifier to be automatic, so we will use the techniques offered by the field of Machine Learning.
But first we need to build a suitable database on which to investigate different strategies. The aim of this report is therefore to explain the process followed to create the database that the UADI project will use. The starting point for obtaining that database is Darpa98, and the goal is to create one with characteristics similar to the KDDCup database used in the scientific community. The features of the resulting database (gurekddcup) will be similar to those of KDDCup99, but with payload information and several connection features (IP addresses, port numbers, ...) added. The steps followed to create KDDCup99 are therefore explained next, since roughly the same steps will have to be followed to create gureKddcup, a new extension of KDDCup99 (kddcup99+payload), that is, the database we need. | es |
dc.description.abstract | The gureKDDCup database has been generated within the UADI project (Unsupervised Anomaly Detection for Intrusion detection system), whose goal is to develop a classifier that detects intrusions or attacks in network-based systems. To develop this classifier we are going to use unsupervised classification techniques. The main distinctive feature of this project is that it uses the payload (the data field of network packets) to detect attacks in network connections. Using the payload to classify connections has not yet been studied in depth; however, it seems to be essential for detecting attacks such as R2L (Remote to Local, whose goal is to use resources without permission) and U2R (User to Root, whose goal is to obtain root or administrative privileges without being entitled to them).
In the classification process we have to handle a huge number of connections and discover useful patterns among them, which leads us to the Data Mining field. Moreover, we want our UADI system to be able to discover patterns or generate the model of network traffic automatically, that is, we want the learning process to be automatic, and to make this possible we are going to use Machine Learning techniques.
But first it is essential to generate an appropriate database to work on. So the aim of this report is to explain the process we have followed to generate the database we used in the UADI project. The objective is to generate a database with characteristics similar to KDDCup99, a database widely used in the scientific community, taking Darpa98 (DARPA Intrusion Detection Data Sets) as the starting point. The generated database is called gureKDDCup and it has features similar to those in KDDCup99, but we added payload information and other connection-related features such as IP addresses and port numbers. The following lines explain the steps followed to generate the KDDCup99 database, because our aim is to repeat those steps as accurately as possible to create the database we need in the UADI project: a new extension of KDDCup99 (KDDCup99+payload), which we call gureKDDCup. | es |
dc.description.sponsorship | The University of the Basque Country UPV/EHU (BAILab, grant UFI11/45);
The Department of Education, Universities and Research of the Basque Government (grant IT-395-10);
The Ministry of Economy and Competitiveness of the Spanish Government and the European Regional Development Fund - ERDF (eGovernAbility, grant TIN2014-52665-C2-1-R); | es |
dc.language.iso | eng | es |
dc.relation.ispartofseries | EHU-KAT-IK-02-16; | |
dc.rights | info:eu-repo/semantics/openAccess | es |
dc.rights.uri | http://creativecommons.org/licenses/by-sa/4.0/ | * |
dc.subject | KDDCup99 database | es |
dc.subject | network Intrusion Detection System (nIDS) | es |
dc.subject | non-flood attacks | es |
dc.subject | flood attacks | es |
dc.subject | payload analysis | es |
dc.subject | bro-ids | es |
dc.subject | attribute extraction | es |
dc.subject | data mining | es |
dc.subject | machine learning | es |
dc.subject | clustering | es |
dc.subject | anomaly detection | es |
dc.title | Generation of the database gurekddcup | es |
dc.type | info:eu-repo/semantics/report | es |
dc.rights.holder | Attribution-ShareAlike 4.0 International | * |
dc.departamentoes | Arquitectura y Tecnología de Computadores | es_ES |
dc.departamentoeu | Konputagailuen Arkitektura eta Teknologia | es_ES |