Validation de clustering des données dans un contexte big data

dc.contributor.author: Medfouni, Hayet
dc.contributor.author: Khantoul, Bilel
dc.date.accessioned: 2018-12-05T11:06:11Z
dc.date.available: 2018-12-05T11:06:11Z
dc.date.issued: 2018
dc.description.abstract: For more than five decades, computing has been at the heart of our businesses, hospitals, ministries, and homes. This heavy use of computing has generated volumes of data that conventional software and hardware cannot manage. Consider large companies such as Google and Microsoft, which must store billions of data items. The difficulty of managing such volumes gave rise to Big Data. Potentially unbounded quantities of data, and the constraints that follow from them, pose many processing problems. These constraints include the impossibility of storing all of this massive data, the difficulty of partitioning it into homogeneous groups without knowing the number of clusters a priori, and the need to produce these clusters in real time. In this work, we propose a distributed, parallel approach that scales external clustering validation to large data sets, using the Jaccard coefficient as the validation index. To do this, we use the Hadoop platform, one of the leading Big Data platforms, which relies on the MapReduce paradigm. The results obtained show the validity of the models developed on the Hadoop platform. [ar]
dc.identifier.uri: http://hdl.handle.net/123456789/6933
dc.language.iso: fr [ar]
dc.publisher: Université Oum El Bouaghi [ar]
dc.subject: Big Data [ar]
dc.subject: Clustering [ar]
dc.subject: Clustering validation [ar]
dc.subject: External validation [ar]
dc.subject: Jaccard coefficient [ar]
dc.title: Validation de clustering des données dans un contexte big data [ar]
dc.type: Other [ar]
Files
Original bundle
Name: mémoire final.pdf
Size: 4.75 MB
Format: Adobe Portable Document Format