ndanevski1/Set-Similarity

Set-Similarity

This is a group project completed as part of my Undergraduate Research Course. Due to University policy, I am not allowed to share the code publicly. Please contact me for further details.

Abstract

Set similarity is a pervasive concept in mathematics and computer science, applied extensively in databases and data mining. Many measures of set similarity exist. In this paper we compare the most notable and commonly used ones.

We evaluate set similarity measures by first translating the problem from its abstract set-theoretic domain into a concrete graph theory problem and then exhaustively testing the most common measures. The results show that using the Tversky index as the similarity measure yields the best results. This measure depends on two coefficients, α and β, and we find their optimal values to be 1.0 and 0.01, respectively.
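
For readers unfamiliar with the Tversky index, here is a minimal Python sketch of its standard textbook definition, S(X, Y) = |X ∩ Y| / (|X ∩ Y| + α|X − Y| + β|Y − X|). It is not the project's code, which cannot be shared. The function name `tversky_index`, its defaults (set to the α and β values reported above), and the handling of empty inputs are assumptions made for illustration only.

```python
def tversky_index(x, y, alpha=1.0, beta=0.01):
    """Tversky index of two collections, treated as sets.

    S(X, Y) = |X & Y| / (|X & Y| + alpha * |X - Y| + beta * |Y - X|)

    The defaults mirror the coefficients reported in the abstract;
    the function itself is an illustrative sketch, not the authors' code.
    """
    x, y = set(x), set(y)

    # Assumed convention: two empty sets are considered identical.
    if not x and not y:
        return 1.0

    common = len(x & y)   # elements shared by both sets
    only_x = len(x - y)   # elements found only in X
    only_y = len(y - x)   # elements found only in Y

    denominator = common + alpha * only_x + beta * only_y
    if denominator == 0:
        # Reached only when there is no overlap and the weights zero out
        # every difference term; treated here as no similarity.
        return 0.0
    return common / denominator


if __name__ == "__main__":
    a = {"apple", "banana", "cherry"}
    b = {"banana", "cherry", "date"}
    print(tversky_index(a, b))                        # reported coefficients
    print(tversky_index(a, b, alpha=1.0, beta=1.0))   # equals the Jaccard index
    print(tversky_index(a, b, alpha=0.5, beta=0.5))   # equals the Dice coefficient
```

Setting α = β = 1 recovers the Jaccard index and α = β = 0.5 recovers the Dice coefficient. With α = 1.0 and β = 0.01 the measure is asymmetric: elements found only in the first set are penalized far more heavily than elements found only in the second.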

About

This is a group project completed as part of my Undergraduate Research Course. Only the final paper is available to present this project because, due to University policy, I am not allowed to share the code publicly. Please contact me for further details.
