UTIA - Library

bibtype

C - Conference Paper (international conference)

ARLID

0507120

utime

20240103222343.2

mtime

20190731235959.9

SCOPUS

85001945953

WOS

000391051600008

DOI

10.1145/2996758.2996761

title (primary) (eng)

Discriminative models for multi-instance problems with tree-structure

specification

page_count	9 s.
media_type	P

serial

ARLID

cav_un_epca*0507119

ISBN

978-1-4503-4573-6

title

Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security (AISec'16)

page_num

83-91

publisher

place	New York
name	ACM
year	2016

keyword

big data

keyword

learning indicators of compromise

keyword

malware detection

keyword

neural network

keyword

user modeling

author (primary)

ARLID	cav_un_auth*0307300
name1	Pevný
name2	T.
country	CZ

author

ARLID	cav_un_auth*0101197
name1	Somol
name2	Petr
full_dept (cz)	Rozpoznávání obrazu
full_dept	Department of Pattern Recognition
department (cz)	RO
department	RO
institution	UTIA-B
fullinstit	Ústav teorie informace a automatizace AV ČR, v. v. i.

source

url	http://library.utia.cas.cz/separaty/2019/RO/somol-0507120.pdf

cas_special

abstract (eng)

Modelling network traffic is gaining importance as a way to counter modern security threats of ever-increasing sophistication. It is, however, surprisingly difficult and costly to construct reliable classifiers on top of telemetry data, owing to the variety and complexity of signals that no human can manage to interpret in full. Obtaining training data with a sufficiently large and variable body of labels can thus be seen as a prohibitive problem. The goal of this work is to detect infected computers by observing their HTTP(S) traffic collected from network sensors, which are typically proxy servers or network firewalls, while relying on only minimal human input in the model training phase. We propose a discriminative model that makes decisions based on all of a computer's traffic observed during a predefined time window (5 minutes in our case). The model is trained on traffic samples collected over equally sized time windows for a large number of computers, where the only labels needed are (human) verdicts about the computer as a whole (presumed infected vs. presumed clean). As part of training, the model itself learns discriminative patterns in traffic targeted to individual servers and constructs the final high-level classifier on top of them. We show that the classifier performs with very high precision, and demonstrate that the learned traffic patterns can be interpreted as Indicators of Compromise. We implement the discriminative model as a neural network with a special structure reflecting two stacked multi-instance problems. The main advantages of the proposed configuration include not only improved accuracy and the ability to learn from gross labels, but also automatic learning of server types (together with their detectors) that are typically visited by infected computers.
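
The two stacked multi-instance problems described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mean pooling, the layer sizes, the random untrained weights, the feature vectors, and every name in it (`score_computer`, `relu_layer`, `mean_pool`, `init_layer`, the example hostnames) are assumptions made here for illustration only. Flows to one server form the inner bag, and the servers contacted by one computer within a time window form the outer bag.

```python
import math
import random

def relu_layer(x, weights, bias):
    """Affine map followed by ReLU; weights is a list of rows."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def mean_pool(vectors):
    """Element-wise mean over a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def init_layer(rng, n_out, n_in):
    """Random (untrained) weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def score_computer(traffic, params):
    """traffic maps a server name to the list of flow feature vectors
    observed for that server in one time window. Returns a value in (0, 1)."""
    (w1, b1), (w2, b2), (w3, b3) = params
    server_vecs = []
    for flows in traffic.values():
        # Inner multi-instance problem: flows to one server form a bag.
        flow_embeddings = [relu_layer(f, w1, b1) for f in flows]
        server_vecs.append(relu_layer(mean_pool(flow_embeddings), w2, b2))
    # Outer multi-instance problem: servers of one computer form a bag.
    computer_vec = mean_pool(server_vecs)
    logit = sum(w * v for w, v in zip(w3[0], computer_vec)) + b3[0]
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)
# Hypothetical sizes: 3 flow features, 4 hidden units per level, 1 output.
params = [init_layer(rng, 4, 3), init_layer(rng, 4, 4), init_layer(rng, 1, 4)]
traffic = {
    "example.com": [[0.1, 0.5, 0.0], [0.3, 0.2, 1.0]],
    "example.org": [[0.9, 0.1, 0.4]],
}
score = score_computer(traffic, params)
```

Because both pooling steps are order-independent, the score does not depend on the order of flows or servers, and only one label per computer would be needed to train such a structure end to end.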

action

ARLID	cav_un_auth*0377828
name	the 2016 ACM Workshop on Artificial Intelligence and Security (AISec'16)
dates	20161028
mrcbC20-s	20161028
place	Vienna
country	AT

RIV

FORD0

20000

FORD1

20200

FORD2

20205

reportyear

2020

num_of_auth

presentation_type

inst_support

RVO:67985556

permalink

http://hdl.handle.net/11104/0298524

confidential

mrcbC86

n.a. Proceedings Paper Computer Science Artificial Intelligence|Computer Science Theory Methods

arlyear

2016

mrcbU14

85001945953 SCOPUS

mrcbU24

PUBMED

mrcbU34

000391051600008 WOS

mrcbU63

cav_un_epca*0507119 Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security (AISec'16) 978-1-4503-4573-6 83 91 New York ACM 2016