TheDomains.com

AT&T Researchers Look at Spotting Malicious Domains Using Word Segmentation

June 16, 2015 by Raymond Hackney

Two researchers from AT&T, Wei Wang and Kenneth E. Shirley, had a domain-related paper published on arXiv.org, the preprint site hosted by Cornell University. The paper looks at whether word patterns found in domain names can be useful in spotting malicious domains.

From the paper:

In recent years, vulnerable hosts and maliciously registered domains have been frequently involved in mobile attacks. In this paper, we explore the feasibility of detecting malicious domains visited on a cellular network based solely on lexical characteristics of the domain names. In addition to using traditional quantitative features of domain names, we also use a word segmentation algorithm to segment the domain names into individual words to greatly expand the size of the feature set.

Experiments on a sample of real-world data from a large cellular network show that using word segmentation improves our ability to detect malicious domains relative to approaches without segmentation, as measured by misclassification rates and areas under the ROC curve. Furthermore, the results are interpretable, allowing one to discover (with little supervision or tuning required) which words are used most often to attract users to malicious domains. Such a lightweight approach could be performed in near-real time when a device attempts to visit a domain. This approach can complement (rather than substitute) other more expensive and time-consuming approaches to similar problems that use richer feature sets.
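To make the idea in the excerpt more concrete, here is a minimal sketch of segmenting a domain name into words and combining those words with simple quantitative features. It is an illustration only, not the authors' algorithm: the toy vocabulary `VOCAB`, the cost values, and the function names `segment` and `lexical_features` are all assumptions made for this example.

```python
from functools import lru_cache

# Hypothetical toy vocabulary; the dictionary used in the paper is not
# described in this article.
VOCAB = {"cheap", "rayban", "outlet", "sale", "nike", "shoes",
         "pay", "day", "payday", "loan", "loans"}

UNKNOWN_CHAR_COST = 10  # assumed penalty for a character no word covers


def segment(label: str) -> list[str]:
    """Split a label into vocabulary words, preferring fewer, longer tokens."""

    @lru_cache(maxsize=None)
    def best(i: int) -> tuple[int, tuple[str, ...]]:
        # Returns (total cost, tokens) for the cheapest split of label[i:].
        if i == len(label):
            return (0, ())
        candidates = []
        for j in range(i + 1, len(label) + 1):
            piece = label[i:j]
            if piece in VOCAB:
                cost = 1
            elif j == i + 1:
                cost = UNKNOWN_CHAR_COST  # fall back to a single character
            else:
                continue
            rest_cost, rest = best(j)
            candidates.append((cost + rest_cost, (piece,) + rest))
        return min(candidates, key=lambda c: c[0])

    return list(best(0)[1])


def lexical_features(domain: str) -> dict[str, float]:
    """Combine simple quantitative features with one indicator per word."""
    name = domain.lower().rsplit(".", 1)[0]  # drop the top-level domain
    feats = {
        "length": float(len(name)),
        "digits": float(sum(ch.isdigit() for ch in name)),
        "hyphens": float(name.count("-")),
    }
    for part in name.replace(".", "-").split("-"):
        for tok in segment(part):
            if len(tok) > 1:  # skip single unknown characters
                feats["word=" + tok] = 1.0
    return feats


print(segment("cheapraybanoutlet"))  # ['cheap', 'rayban', 'outlet']
print(lexical_features("cheap-raybanoutlet.com"))
```

Each segmented word becomes its own feature, which is how segmentation can expand the feature set well beyond counts such as length or number of digits. A classifier trained on such features would then assign a weight to each word.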

Among the largest 400 out of these 5327 coefficients (i.e. those most strongly associated with maliciousness) were several words that fell into groups of related words, which we manually labeled in the following list:

1) Brand names: rayban, oakley, nike, vuitton, hollister, timberland, tiffany, ugg
2) Shopping: dresses, outlet, sale, dress, offer, jackets, watches, deals
3) Finance: loan, fee, cash, payday, cheap
4) Sportswear: jerseys, kicks, cleats, shoes, sneaker
5) Basketball Player Names (associated with shoes): kobe, jordan, jordans, lebron
6) Medical/Pharmacy: medic, pills, meds, pill, pharmacy
7) Adult: webcams, cams, lover, sex, porno
8) URL spoof: com
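As a rough illustration of how the labeled groups above could be used, the sketch below looks up already-segmented words in the list quoted from the paper. This is a plain lookup, not the authors' fitted model, which assigns a coefficient to each word; the names `GROUPS`, `WORD_TO_GROUP`, and `flagged_groups` are hypothetical.

```python
# Word groups copied from the list quoted above.
GROUPS = {
    "Brand names": ["rayban", "oakley", "nike", "vuitton", "hollister",
                    "timberland", "tiffany", "ugg"],
    "Shopping": ["dresses", "outlet", "sale", "dress", "offer", "jackets",
                 "watches", "deals"],
    "Finance": ["loan", "fee", "cash", "payday", "cheap"],
    "Sportswear": ["jerseys", "kicks", "cleats", "shoes", "sneaker"],
    "Basketball player names": ["kobe", "jordan", "jordans", "lebron"],
    "Medical/Pharmacy": ["medic", "pills", "meds", "pill", "pharmacy"],
    "Adult": ["webcams", "cams", "lover", "sex", "porno"],
    "URL spoof": ["com"],
}

# Reverse index: word -> group label.
WORD_TO_GROUP = {word: group for group, words in GROUPS.items() for word in words}


def flagged_groups(tokens: list[str]) -> dict[str, list[str]]:
    """Return the listed words found among the tokens, keyed by group."""
    hits: dict[str, list[str]] = {}
    for tok in tokens:
        word = tok.lower()
        group = WORD_TO_GROUP.get(word)
        if group is not None:
            hits.setdefault(group, []).append(word)
    return hits


print(flagged_groups(["cheap", "jordans", "outlet", "blog"]))
```

A match here only means the word appears in the quoted list; it does not by itself indicate that a domain is malicious, since many legitimate domains contain the same words.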

Read the full paper on arXiv.org

Filed Under: Domains
