Topical Crawl Use Cases for Different Industries

In addition to the many SEO-related uses explained in the Quick SEO tutorial, this page describes a few topical crawl use cases in more detail, both successful and unsuccessful.

ADVERTISING

Feasibility study for a drug coupons startup.

ARCHITECTURE

SEO analysis of the niche.

BUSINESS

Discovery of prospective clients.

CLASSIFIEDS

Discovery of the biggest IT job sites.

CLASSIFIEDS

Aggregation of fresh ads from all over the web.

ECOMMERCE

Competition analysis.

EDUCATION

Online language resources.

FOOD

Recipe search engine.

HOME & GARDEN

Resource discovery.

INTERNET MARKETING

Backlink analysis of the cord blood niche.

INTERNET MARKETING

Backlink analysis of the SEO niche.

NEWS

Crawl news from tens of thousands of sources.

RECREATION

Discover relevant resources.

RESEARCH

Finding a specific manufacturer.

TRAVEL

Discover pages for link building.


ADVERTISING


Client: startup company
Offering: drug coupons
Domain: marketing & advertising
Challenge: discover only fresh and drug related coupons
Solution: extremely hard, requires much backend custom development
Plan: pro
Expert setup: yes

The topic is not clearly defined and overlaps with several other topics. In particular, it is very hard to distinguish drug coupons from other coupons, because the relevant pages contain very little text, and often almost identical text. It is also hard to detect only valid coupons: they expire quickly on their source sites and may no longer be valid by the time the information reaches the end user.

ARCHITECTURE


Client: architecture studio
Offering: architectural services
Domain: architecture
Challenge: discover relevant resources for off-page SEO
Solution: simple crawl setup & SEO link tools analysis
Plan: basic
Expert setup: no

The architecture studio, a winner of several architectural prizes, wanted to make the right changes to its site and to find out what could be done off-site to achieve higher rankings in search engines.

BUSINESS


Client: contract research organization (CRO)
Offering: expert services to medical laboratories
Domain: Biotechnology
Challenge: discover CROs relevant to Alzheimer's, neurodegenerative, and brain disorders
Solution: easy crawl setup and custom SQL query
Plan: basic
Expert setup: yes

The client needed to discover as many relevant CROs as possible that could make use of its expert services. Because of how those sites write about their research and activities, traditional search engines could not easily return relevant information. Although the topic is not clearly defined and the results contain noise, we knew that the final results would be few and easy to verify manually, which allowed us to use this simple crawl setup. A custom query that searched across coupled pages from the same domain retrieved a list of several dozen relevant resources, a decent number in this highly specialized niche.
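The custom query itself is not shown here. As a rough illustration only, a search across coupled pages from the same domain could look like the sketch below. It assumes a hypothetical `pages(domain, url, text)` table, which is not the actual crawl database schema, and uses an in-memory SQLite database with made-up rows.

```python
import sqlite3


def domains_matching_both(conn, term_a, term_b):
    """Return domains where some page mentions term_a and some page
    (the same one or a different one) mentions term_b."""
    query = """
        SELECT DISTINCT a.domain
        FROM pages AS a
        JOIN pages AS b ON a.domain = b.domain
        WHERE a.text LIKE '%' || ? || '%'
          AND b.text LIKE '%' || ? || '%'
        ORDER BY a.domain
    """
    return [row[0] for row in conn.execute(query, (term_a, term_b))]


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (domain TEXT, url TEXT, text TEXT)")
conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", [
    ("cro-one.example", "/about", "We are a contract research organization."),
    ("cro-one.example", "/studies", "Preclinical Alzheimer models."),
    ("cro-two.example", "/about", "A contract research organization for oncology."),
    ("blog.example", "/post", "News about Alzheimer research."),
])

print(domains_matching_both(conn, "contract research", "Alzheimer"))
# ['cro-one.example']
```

The self-join on `domain` is what lets the two terms appear on different pages of the same site, which a page-level keyword search would miss.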


Challenge 2: discover pharmaceutical companies developing and testing new drugs for Alzheimer's disease
Solution 2: crawl setup and custom SQL query

The client needed to discover all pharmaceutical companies researching new drugs for Alzheimer's disease (AD). The crawl was easy to set up because the topic is well defined. As above, a single custom query that searched across coupled pages from the same domain retrieved a list of several dozen resources, yet manual inspection of the list did not yield enough satisfactory links. We were not able to distinguish companies that work with new drugs from those that work with old drugs.

CLASSIFIEDS


Client: IT jobs search engine startup
Offering: IT search & freelance collaboration platform
Domain: IT
Challenge: discovery of fresh ads from all over the web
Solution: few step setup and periodic crawl with delivery of only fresh data
Plan: pro
Expert setup: yes

The project required a continuously running topical crawler for discovering jobs in the IT industry. This task has a specific but common requirement: the crawler needs to discover new content on the web frequently while ignoring old content. To achieve this, we ran several 'preparation' crawls and analyzed them with SEO link tools in order to find the best hub pages that generate fresh links. Two things to keep in mind with such crawls: the number of links on the web is practically infinite, and a page, or a link on an old page, that is new to the crawler is not necessarily new on the web. The project needs only pages that are new on the web, that is, recently posted job ads. We therefore need to control the URL frontier that the crawler (re)visits. We do this by setting the explore seeds frontier option to 'new links only'.
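How the 'new links only' option works internally is not described here. The sketch below is only an assumption about how such a policy could behave: hub pages are revisited, and a link is passed on only if it was not seen on any earlier visit. The class name, method names, and URLs are hypothetical.

```python
class NewLinksOnlyFrontier:
    """Tracks links already seen on hub pages and passes on only unseen ones."""

    def __init__(self, baseline_links=()):
        # Links found during the 'preparation' crawls count as already known,
        # so old ads are not mistaken for fresh ones on the first real visit.
        self.seen = set(baseline_links)

    def new_links(self, extracted_links):
        """Return links not seen before, in page order, and remember them."""
        fresh = []
        for link in extracted_links:
            if link not in self.seen:
                self.seen.add(link)
                fresh.append(link)
        return fresh


frontier = NewLinksOnlyFrontier(baseline_links=["https://jobs.example/ad/1"])
first_visit = frontier.new_links(
    ["https://jobs.example/ad/1", "https://jobs.example/ad/2"]
)
second_visit = frontier.new_links(
    ["https://jobs.example/ad/2", "https://jobs.example/ad/3"]
)
print(first_visit)   # ['https://jobs.example/ad/2']
print(second_visit)  # ['https://jobs.example/ad/3']
```

Seeding the set from the preparation crawls is one way to approximate "new on the web" rather than merely "new to the crawler".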

CLASSIFIEDS


Client: garage sales search engine
Offering: yard sale search
Domain: shopping
Challenge: discovery of fresh ads from all over the web
Solution: very hard iterative setup and periodic crawl with delivery of only fresh data
Plan: pro
Expert setup: yes

Like the other classifieds use case above, this task is about discovering only new ads. In addition, this crawl was very difficult to set up because the topic is not clearly defined in terms of word statistics: a few other topics use very similar words. Even though the final results are quite good, additional processing on the client side is required.

ECOMMERCE


Client: small business owner
Offering: stoves
Domain: home & garden
Challenge: discover relevant resources and analyze competition
Solution: simple crawl setup & SEO link tools analysis
Plan: basic
Expert setup: yes

This clearly defined topic was easy to set up, and the results were informative for analyzing marketing strategy.

EDUCATION


Client: language learning website
Offering: online language resources
Domain: language translations, word games
Challenge: provide relevant resources to visitors
Solution: simple crawl setup
Plan: pro
Expert setup: yes

A site about languages, word games, and linguistics needed a good 'resources' page. Its custom search page used the API to get results from our topical crawl.

FOOD


Client: recipe search engine
Offering: recipes by ingredients and health conditions
Domain: food & health
Challenge: find recipe pages from both large and small sites
Solution: advanced, occasionally recurring crawl
Plan: pro
Expert setup: yes

This rather unique search engine wants to expand its recipe base to a practically unlimited number of sources. It integrates our data with its existing ingredient parsing technology.

HOME & GARDEN


Client: exterminator
Offering: pest control
Domain: home & garden
Challenge: discover relevant resources and analyze competition
Solution: simple crawl setup & SEO link tools analysis
Plan: pro
Expert setup: yes

This was a typical use case with a straightforward solution. The topic is well defined, the crawl was easy to set up, and the SEO tools revealed the best sites in the niche.

INTERNET MARKETING


Client: SEO consultant
Offering: SEO services
Domain: Cord Blood
Challenge: discover useful and actionable information in the niche
Solution: semi-hard crawl setup & SEO link tools analysis
Plan: pro
Expert setup: yes

This is a typical use case for SEO professionals. The client needed competitive intelligence in the niche. The niche is not entirely well defined, because its words overlap with a few other topics, so setup by an expert was required. Once the crawl was set up and finished, the SEO link analysis tools enabled discovery of the best sites, best pages, common phrases, and other useful information.

INTERNET MARKETING


Client: Chicago SEO company
Offering: SEO services
Domain: SEO
Challenge: discover useful and actionable information in the niche
Solution: average difficulty crawl setup & SEO link tools analysis
Plan: pro
Expert setup: yes

This is another typical use case for SEO professionals. In a specific public crawl, 40 links from "Chicago SEO" pages were used as examples. While the SEO topic is well defined, the results are not limited to Chicago. The crawler followed and gathered relevant content regardless of geolocation, but finding only the Chicago-related domains is easy with the SEO link tools. Restricting the crawl to Chicago-only pages would be very tricky, and questionable from the standpoint of topical relevance, because many pages from Chicago SEO companies do not mention Chicago. "Cleaning up" the graph would require either specifying content phrases or running two crawls: the first for the general SEO topic, the second over the list of Chicago-related sites discovered in the first crawl. From the standpoint of graph analysis, however, this scenario would not yield representative results, and the current "fuzzy" solution is much more appropriate.
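As an illustration of the two-crawl idea only, the sketch below builds the site list for a second crawl from the pages of a first crawl. It works at the domain level, since many pages of a Chicago company never mention Chicago. The function name and sample data are hypothetical, and plain substring matching is an assumption, not how the SEO link tools do it.

```python
def sites_mentioning(pages, phrase):
    """Return sorted domains that have at least one page containing the phrase.

    pages is an iterable of (domain, page_text) pairs.
    """
    needle = phrase.lower()
    return sorted({domain for domain, text in pages if needle in text.lower()})


# Made-up results of a first crawl over the general SEO topic.
first_crawl = [
    ("seo-a.example", "SEO audits for small businesses."),
    ("seo-a.example", "Contact our Chicago office."),
    ("seo-b.example", "Link building services in Austin."),
    ("seo-c.example", "Chicago SEO consulting."),
]

second_crawl_sites = sites_mentioning(first_crawl, "Chicago")
print(second_crawl_sites)  # ['seo-a.example', 'seo-c.example']
```

Note that `seo-a.example` qualifies even though only one of its pages mentions Chicago, which is why a page-level filter would discard relevant content.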

NEWS


Client: news portal
Offering: diversified perspectives on a topic
Domain: news
Challenge: discover fresh content in tens of thousands of news sources
Solution: few step crawl setup and periodic crawl with delivery of only fresh data
Plan: business
Expert setup: yes

This project required a general crawl over a defined set of domains. Only news-related domains were selected, the crawl frontier was defined, and the news portal uses the API to continuously retrieve fresh news.

RECREATION


Client: website owner & hiking tour organizer
Offering: hiking tours across Serbia
Domain: travel & recreation
Challenge: discovering all relevant resources within the Balkans
Solution: simple crawl with a few custom SQL queries & SEO link tools analysis
Plan: basic
Expert setup: yes

A hiking enthusiast from Serbia wanted to know all the sites about hiking in the Balkan mountains so that he could explore the competition and the places where people talk about hiking. The client has a website and needs to promote it. Analysis of the niche helped with decisions about future plans. The topic is well defined and the crawl was easy to set up.

RESEARCH


Client: product designer & inventor
Offering: heart plate
Domain: food, eating habits, & health
Challenge: finding a US-based manufacturer of custom-shaped ceramic plates
Solution: no US manufacturers found
Plan: basic
Expert setup: yes

A top-ranked designer and award-winning inventor wanted to support the US economy by finding a US-based manufacturer of custom-shaped ceramic plates, and could not find one with well-known search engines. This is a very tricky topic for a topical crawler because it is not clearly defined in terms of word statistics. Its boundary overlaps greatly with a few other topics, so the results contain much noise. After a number of iterations and more advanced graph exploration with custom queries that searched across coupled pages from the same domain, we concluded that there may indeed be no US-based manufacturers of custom ceramic plates.

TRAVEL


Client: website owner
Offering: private accommodation
Domain: travel & tourism
Challenge: discover relevant resources for link building
Solution: a few iterations of crawl setup parameters and SEO link tools analysis
Plan: basic
Expert setup: yes

This is another typical use case for a not so well defined topic. While the client wanted region-specific content discovery, the word statistics of the topic are such that tourism and accommodation words outweighed location-specific words. Additional content phrases were needed for a more accurate analysis of the results. The prospect pages backlink tool was used to find relevant web pages with a few outgoing links to various domains, which indicates a possible link building opportunity.
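The selection rule of the prospect pages backlink tool is not documented here. A minimal sketch of the idea, assuming a page qualifies when it has only a few external links and those links point to more than one domain, might look as follows. The function name, thresholds, and URLs are all hypothetical.

```python
from urllib.parse import urlsplit


def prospect_pages(outlinks, max_links=10, min_domains=2):
    """Return pages with few external links that point to several domains.

    outlinks maps a page URL to the list of URLs it links to.
    The thresholds are illustrative assumptions, not the tool's settings.
    """
    prospects = []
    for page, links in outlinks.items():
        page_host = urlsplit(page).hostname
        # Keep only absolute links that leave the page's own host.
        external = [
            url for url in links
            if urlsplit(url).hostname not in (None, page_host)
        ]
        domains = {urlsplit(url).hostname for url in external}
        if len(external) <= max_links and len(domains) >= min_domains:
            prospects.append(page)
    return prospects


sample = {
    "https://stay.example/links": [
        "https://a.example/",
        "https://b.example/rooms",
        "https://stay.example/contact",
    ],
    "https://dir.example/all": [f"https://site{i}.example/" for i in range(50)],
    "https://blog.example/post": ["https://a.example/x", "https://a.example/y"],
}
print(prospect_pages(sample))  # ['https://stay.example/links']
```

The large directory page is skipped because a link among fifty carries little weight, and the blog post is skipped because all of its links go to a single domain.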

Customizable & Flexible Crawling

Our highly customizable crawler can be easily configured for the most diverse scenarios.

© 2017 semanticjuice.com