How We Identified All 9,716,205 Products Sold on Shopify

Shopify is the leading cloud-based ecommerce platform, with an astonishing market cap of 10 billion dollars. According to its 2016 annual report, it has 430,000 merchants.

Since all the stores Shopify hosts use images, files and assets served from the Shopify CDN, we can easily identify the domains using Shopify: their pages contain the string “cdn.shopify.com”.
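
The check described above can be sketched in a few lines of Python. This is a minimal illustration, not our crawler's actual code: the function names, the (url, html) input shape, and the sample pages are all assumptions made for the example.

```python
from urllib.parse import urlparse

# The marker string named in the text above.
SHOPIFY_MARKER = "cdn.shopify.com"


def is_shopify_page(html):
    """Return True if the page source references the Shopify CDN."""
    return SHOPIFY_MARKER in html


def shopify_domains(pages):
    """Collect unique hostnames from (url, html) pairs that reference the CDN."""
    domains = set()
    for url, html in pages:
        if is_shopify_page(html):
            host = urlparse(url).hostname
            if host:
                domains.add(host.lower())
    return domains


# Hypothetical crawl sample; the URLs and markup are made up for illustration.
crawl_sample = [
    ("https://store-a.example/products/mug", '<img src="//cdn.shopify.com/s/files/1/mug.jpg">'),
    ("https://store-a.example/products/cup", '<link href="//cdn.shopify.com/s/files/1/theme.css">'),
    ("https://blog.example/post", "<p>No store assets here.</p>"),
]
print(sorted(shopify_domains(crawl_sample)))
```

Two pages on the same store count as one domain, which is why the domain total is so much smaller than the page total.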

Our June web crawl contains over 3 billion pages, and we were able to uncover a total of 94,135 unique domains and 10,096,766 pages hosted by Shopify, with a total of 9,716,205 unique products.

That’s a lot of products! But we wanted to dig even deeper, so we extracted product attributes from each page.

The most expensive products for sale

Since shop owners can set any price, there are a number of listings priced at over $1,000,000 that are not true products and are either test listings or fake product listings. For this analysis we looked at products priced around $500,000.
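
The filtering step above could look something like the sketch below. The $500,000 ceiling comes from the paragraph above; the function name, the record layout, and the "hypothetical test listing" row are assumptions for illustration, and the other three rows are taken from the table that follows.

```python
def top_priced(products, ceiling=500_000, limit=20):
    """Drop listings priced above `ceiling`, then return the most expensive first."""
    eligible = [p for p in products if p["price"] <= ceiling]
    return sorted(eligible, key=lambda p: p["price"], reverse=True)[:limit]


listings = [
    {"title": "Louis Glick Starburst Diamond Ring", "price": 475_000},
    {"title": "Hypothetical test listing", "price": 9_999_999},  # made up for illustration
    {"title": "Toren 150 Myrtle Avenue", "price": 496_000},
    {"title": "Nautilus-E24", "price": 392_000},
]

for item in top_priced(listings, limit=3):
    print(f'{item["title"]}: ${item["price"]:,}')
```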

The most expensive products appear to be real estate (who knew you could buy a house on Shopify?), jewelry, artwork, rare domain names and cars. You can even find a 1.8 petabyte storage solution and fusion machine for sale.

Product | Price | Page | Domain
Toren 150 Myrtle Avenue - For Sale / Studio / Downtown Brooklyn | $496,000 | https://www.themodernagent.com/products/toren-150-myrtle-avenue-new-york | www.themodernagent.com
Audemars Piguet Jules Audemars Grand Complication 25866OR.OO.D002CR.02 | $494,380 | https://www.pacificbaywatch.com/products/audemars-piguet-jules-audemars-grand-complication-25866or-oo-d002cr-02 | www.pacificbaywatch.com
Louis Glick Starburst Diamond Ring | $475,000 | https://www.osterjewelers.com/products/louis-glick-5-13ct-starburst-diamond-ring | www.osterjewelers.com
Emerald Columbian | $475,000 | https://jameselliot.com/products/ring-extraordinary-emerald-natural | jameselliot.com
Hearts On Fire Illa Constellation Diamond Bracelet | $475,000 | https://passionfinejewelry.com/products/hearts-on-fire-illa-constellation-diamond-bracelet | passionfinejewelry.com
REDOUTE Pierre Joseph (1759-1840). An Original Watercolour of a Bouquet of Red Rose of Sharon. 1835. | $455,000 | https://www.aradernyc.com/products/redoute-pierre-joseph-1759-1840-an-original-watercolour-of-a-bouquet-of-red-rose-of-sharon-1835 | www.aradernyc.com
Emerald Cut Diamond = 12.57 ct VS1 L and 2 Shields Platinum Ring GIA # 2155746063 DX0744 SMNTX0024 | $437,254 | https://goldsteindiamonds.com/products/emerald-cut-diamond-12-57-ct-vs1-l-and-2-shields-platinum-ring-gia-2155746063-dx0744-smntx0024 | goldsteindiamonds.com
Monrovia Media Cabinet | $425,000 | https://www.mortisetenon.com/products/monrovia-media-cabinet | www.mortisetenon.com
Early Paul Evans Studio Forged Front Cabinet 1964 | $425,000 | https://theexchangeint.com/products/early-paul-evans-studio-forged-front-cabinet-1964 | theexchangeint.com
Patek Philippe 5029J Minute Repeating Limited Edition Watch | $425,000 | https://patekmonger.com/products/patek-philippe-5029j-minute-repeating-limited-edition-watch | patekmonger.com
Nautilus-E24 | $392,000 | https://storagefoundry.net/products/nautilus-e24 | storagefoundry.net
Ava Pendant White by TECH Lighting - FJ Freejack (male adapter only no ceiling canopy) / Satin Nickel / 12V Halogen | $383,200 | https://www.loftmodern.com/products/tech-lighting-ava-pendant-white | www.loftmodern.com
Finibus Bonorum et Malorum - Red / L | $380,000 | https://ap-super-market-2.myshopify.com/products/finibus-bonorum-et-malorum | ap-super-market-2.myshopify.com
Overwatch Mei Climatologist Role Game Anime Cosplay Costumes RC-1021 - S / Full Set / Female | $375,032 | https://www.mycosplayer.com/products/overwatch-mei-climatologist-role-game-anime-cosplay-costumes | www.mycosplayer.com
18th Century Boiserie from a French Chateau Complete Room | $370,000 | https://ellenwardscarboroughantiques.com/products/18th-century-boiserie-from-a-french-chateau-complete-room | ellenwardscarboroughantiques.com
45W - "NL-MH" Post Top Lamp- 1000 Pack | $359,950 | https://www.wholesaleled.com/products/45w-nl-mh-post-top-lamp-1000-pack | www.wholesaleled.com
36"W Marquee Chandel-Air | $352,000 | https://smashingstainedglass.com/products/165603 | smashingstainedglass.com
Burma Ruby and Diamond Bracelet - Gold | $345,000 | https://www.duncanandboyd.com/products/burma-ruby-and-diamond-bracelet | www.duncanandboyd.com
Nacre Noa Necklace - Black Pearl | $340,000 | https://antoniabee.com/products/nacre-noa-necklace | antoniabee.com
MegaMc® 1648 HF 220-240V 50/60Hz 3 Phase Fusion Machine Package | $331,901 | https://www.mcelroyparts.com/products/a4800806 | www.mcelroyparts.com

What are the top currencies used on Shopify?

According to Shopify’s prospectus, the company’s total addressable market in its key geographies is 10 million merchants.

No surprise that USD is the top currency (although Shopify is based in Canada), but there are over 3 million products for sale in other currencies, suggesting Shopify is seeing growth all over the world.

Currency    Products Available
USD 6,601,417
GBP 849,999
CAD 526,354
AUD 481,324
EUR 303,615
INR 158,615
NZD 92,605
JPY 80,765
DKK 67,658
SGD 66,005
MXN 49,764
ZAR 30,115
HKD 23,742
TWD 17,458
NOK 14,484

Which domains sell the most products?

Some merchants don’t carry any inventory and instead use Shopify for drop shipping, something Shopify actively encourages. Drop shipping allows anyone to take orders for products on Shopify, then turn around and place those orders on behalf of the customer on sites like AliExpress.

We identified a number of merchants with over 1,000 products for sale, showing Shopify sellers are more diverse than just small business owners.

Domain    Products Available
solarisjapan.com 73,096
www.inetvideo.com 29,848
www.actionvillage.com 29,636
www.vogily.com 27,426
www.capitalbooksandwellness.com 24,882
shopvida.com 23,631
battlebeavercustoms.com 18,823
michaels.com.au 18,605
www.vinylexchange.co.uk 16,581
www.helixcamera.com 16,152
www.libris.dk 15,185
www.bulbamerica.com 13,976
memorydealers.com 13,832
www.luxedh.com 13,702
www.zarinfabrics.com 13,476
www.outerinner.com 11,358
www.rugandhome.com 11,342
www.mobilcovers.dk 11,193
littlebirdelectronics.com.au 11,170
www.printpit.co.uk 11,104
www.skateamerica.com 11,037
www.citiesocial.com 11,037
monstervacuum.com 10,682
www.myuglychristmassweater.com 10,649
www.annieandco.com 10,348

The most popular domains hosted on Shopify

We looked at the number of links each Shopify store received on the web and calculated PageRank using Apache Spark.
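
For readers who have not seen PageRank before, here is a small sketch of the idea. It is plain single-machine Python using power iteration, not the Apache Spark job we ran; the damping factor of 0.85, the iteration count, and the tiny link graph of made-up domains are all assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores for a link graph by power iteration.

    `links` maps each domain to the list of domains it links to.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A domain with no outgoing links spreads its score evenly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph; none of these domains are from our crawl.
link_graph = {
    "blog.example": ["store-a.example", "store-b.example"],
    "news.example": ["store-a.example"],
    "store-b.example": ["store-a.example"],
}

scores = pagerank(link_graph)
for position, domain in enumerate(sorted(scores, key=scores.get, reverse=True), start=1):
    print(position, domain)
```

Sorting the scores from highest to lowest and numbering the result is one way to turn raw scores into a rank position like the one in the table.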

Domain    Rank
thehundreds.com 3,111
abookapart.com 3,613
mcphee.com 6,376
fredflare.com 6,735
topatoco.com 8,512
parishilton.com 8,924
nineteeneightyeight.com 10,998
theblackkeys.com 11,103
heroarts.com 11,583
mondotees.com 13,609
satechi.net 14,149
fifthelementonline.com 14,759
shop.pimoroni.com 15,971
teepeerecords.com 16,208
sublimestitching.com 17,034
fivefingerdeathpunch.com 17,665
rockabyebabymusic.com 18,127
ddpyoga.com 18,361
funko.com 18,837
20x200.com 20,286
mewithoutyou.com 21,168
mathsgear.co.uk 21,676
smsaudio.com 22,440
dermae.com 22,606
beyondword.com 22,613
21drops.com 23,169
eyecandys.com 23,573
store.roosterteeth.com 23,609
philippefaraut.com 25,545
music.amnestyusa.org 25,897
flyingout.co.nz 25,926
hennessyhammock.com 26,354
audiodamage.com 26,361
spacepaintings.com 26,780
grantcardone.com 26,826
thebalm.com 26,916
waffleflower.com 27,151
johnnycupcakes.com 27,238
tenderlovingempire.com 27,256
lafiestadeolivia.com 27,784
ine.com 28,052
store.iam8bit.com 28,216
hengedocks.com 28,437
gadgetsandgear.com 28,549
posterchildprints.com 28,721
beardhead.com 28,822
hakshop.com 29,027
customslr.com 29,216
cvquiltworks.com 29,232
shop.coolmaterial.com 29,672
bugasalt.com 29,689
data-discs.com 30,280
store.theory11.com 31,178
occmakeup.com 31,193
wintercroft.com 31,284
thebluecrown.com 31,506
howtocakeit.com 31,784
honeygirlorganics.com 31,845
dieselpowergear.com 31,915
abspancakes.com 31,943
monstervacuum.com 31,964
china-cha-dao.com 31,973
the2tails.com 31,986
rhinocameragear.com 32,756
prosperitycandle.com 32,765
snowboardaddiction.com 33,173
sydneyscloset.com 33,178
store.warlordgames.com 34,378
anonymousbitcoinbook.com 34,505
ratheruggedman.net 34,586
shop.fnatic.com 35,011
superscreen.io 35,544
shop.kendamausa.com 36,221
store.yogscast.com 36,247
classroomfriendlysupplies.com 36,393
shop.darlingmagazine.org 36,956
triune-store.myshopify.com 37,533
basick.supplies 37,691
carolinetruerecords.com 38,950
shop.rocketjump.com 39,438
taliaslegacy.org 39,607
shop.kinagrannis.com 40,482
pileofabric.com 40,776
whimsystamps.com 41,869
saddle-creek.com 43,279
ccpvideos.com 43,508
the7line.com 43,912
thimble-art.com 45,173
tacticalresponse.com 48,297
ouchmagazine.com 49,677
deathwishinc.com 50,096
upperplayground.com 50,847
habitatskateboards.com 50,923
bodyglove.com 54,228
thisisground.com 54,667
fixtstore.com 55,143
sharkrobot.com 55,959
ilovepeanutbutter.com 56,477
colourpop.com 57,198
prankpack.com 57,955
topodesigns.com 58,217
theseventhletter.com 58,680
gasparinutrition.com 58,908
driftinnovation.com 59,339
store.dftba.com 59,506
vautecouture.com 59,827
jkdunlimited.com 60,866
thegregorybrothers.com 60,871
benkweller.com 60,872
koraorganics.com 61,163
backtotheroots.com 61,282
skymall.com 61,708
sts9.com 61,855
heydayfootwear.com 62,284
store.nin.com 62,577
freshbrewedtees.com 62,885
teradek.com 63,017
jeffreestarcosmetics.com 63,328
mimoco.com 63,912
qmxonline.com 64,398
ahuva.com 64,638
brandingirons.com 64,652
aqualabtechnologies.com 64,661
mooshoes.com 64,898
healthyisthenewskinny.com 64,972
qredew.com 65,457
columbuswashboard.com 66,907
hopsdirect.com 67,098
24hundred.net 67,112
fanjoy.co 67,150
popculturespot.com 67,182
milestonefilms.com 67,214
serjtankian.com 67,463
zipbuds.com 68,496
zunior.com 68,789
arbutusrecords.com 68,969
kazoos.com 69,356
tameimpala.com 69,802
camerareadycosmetics.com 70,011
cellucor.com 70,133
lagarconne.com 70,308
younggodrecords.com 70,390
shop.krecs.com 72,181
beddys.com 72,385
zero-g.co.uk 72,427
saterdesign.com 72,514
holdfastgear.com 73,354
caljavaonline.com 73,381
handstandspromo.com 73,402
hank3.com 73,573
black-blum.com 73,851
pieceocake.com 74,239
babybrezza.com 74,701
vividaquariums.com 75,371
leahday.com 75,696
minisuit.com 75,915
txdxe.com 76,596
virtualbookworm.com 77,041
billowby.com 77,708
hoodiebuddie.com 78,171
rebdolls.com 78,592
shopvioletvoss.com 78,595
bundlemonster.com 79,036
myairblaster.com 79,840
sendaathletics.com 80,294
buckangelentertainment.com 80,365
shop.yoyoexpert.com 81,082
3diosound.com 81,089
unionvillevineyards.com 81,096
c64audio.com 81,473
mooshwalks.com 81,595
soundiron.com 81,748
loopymango.com 81,967
glamourdolleyes.com 83,016
merrymakersinc.com 83,267
flymenfishingcompany.com 83,284
elhofferdesign.com 83,515
andrewmcmahon.myshopify.com 84,000
serbu.com 84,041
saidthewhale.com 84,616
justnick.myshopify.com 84,811
esqido.com 84,928
rivenrock.com 85,075
pearl-daisy.com 85,079
shop.iso50.com 85,099
domatcha.com 85,112
drdenese.com 85,180
stickerbrand.com 85,284
equicizer.com 85,311
romanatwood.com 85,387
hideitmounts.com 85,488
locobeauty.com 85,497
adamsaaks.net 85,522
tykables.com 85,571
home.lauren-elainedesigns.com 85,653
pharmaskincare.com 85,668
craftandbaby.com 85,786
publicschoolnyc.com 86,301
store.2600.com 86,511

 

Try out the search we used for this analysis on our search engine or contact us about running queries against our crawl index.

About Us

NerdyData provides reports on which websites use a certain piece of source code.

If your competitors share a common piece of code, for example TrendyLibrary.js, all you have to do is search for that term, and we will show you all of the websites that use your competitor’s technology for your sales team to call!

December 2016

How We Found All Of Optimizely’s Clients

For those who aren’t familiar with Optimizely, they are a leader in the growing A/B testing industry. Amazingly, they’ve managed to get their installation code down to a single line of JavaScript.

With one simple query we uncovered a total of 577,395 sites containing that Optimizely JavaScript library.

That’s a lot of clients! But we wanted to dig even deeper and find all distinct Optimizely CDN URLs, which contain Optimizely client numbers. Using a regular expression search we were able to extract a list of over 12,000 URLs used on the top 1 million sites.
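
A regular expression search of this kind might look like the sketch below. The URL shape (cdn.optimizely.com/js/ followed by a numeric ID and .js) is an assumption about how the snippet is typically embedded, not the exact expression we ran, and the sample markup is made up.

```python
import re

# Assumed snippet shape: //cdn.optimizely.com/js/<numeric client id>.js
OPTIMIZELY_URL = re.compile(r"cdn\.optimizely\.com/js/(\d+)\.js")


def optimizely_ids(html):
    """Return the distinct client IDs referenced in a page's source."""
    return set(OPTIMIZELY_URL.findall(html))


# Hypothetical page source with a made-up client number.
sample = '<head><script src="//cdn.optimizely.com/js/123456789.js"></script></head>'
print(optimizely_ids(sample))
```

Collecting the IDs into a set is what makes the resulting list distinct: a client number that appears on many pages is only counted once.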

Try out this and other awesome search tools within our search and regular expression interfaces.

About NerdyData

Our search engine is different from search engines you’ve used before. Traditional search engines are geared towards providing answers, whereas our goal is to give you the best list of results for a query.

Our crawler has visited over 140 million homepages and collected terabytes of HTML, JavaScript, and CSS code. We’ve also designed several search interfaces that allow anybody to query against the source code of webpages, or download a list of sites containing a specific term.

October 2016

How We Found Every Single Vulnerable Website

If you’re a security researcher and you’ve found an exploit in a commonly distributed web application, you may want to find sites that contain that vulnerable application so you can notify them.

The question is: how do you find them?

Google Hacking Is Now Obsolete

Maybe you’ve heard of Google Hacking, a technique hackers use to find vulnerable websites by searching for a filename or block of text that a vulnerable piece of software is known to contain. An example of this would be a Google query like:

inurl:administrators.pwd

or

Powered by XOOPS 2.2.3 Final

If you are familiar with this method of vulnerability hunting, or this sort of thing interests you, you’ll be excited to know we’ve taken Google Hacking to another level.

How Does This Method Differ?

Traditional search engines only let you query the text of a webpage, not the markup. With our search engine you can find all websites that share a common piece of HTML code or JavaScript, in addition to a block of text. Here are some examples of what can be done:

Websites running WordPress that are using version 3.5

Query: <meta name="generator" content="WordPress 3.5" />

Click to see query results

Websites with an upload form on their homepages

Query: name="MAX_FILE_SIZE"

Click to see query results

Websites using the Invision Power Board Forum

Query: ipsBadge

Click to see query results
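
To make the difference from text search concrete, here is a rough sketch of how the first two queries above could be run against a page's raw source. The function names and sample markup are assumptions for illustration and say nothing about how our index is actually implemented.

```python
import re

# Matches the generator tag from the first query, capturing whatever version is present.
GENERATOR = re.compile(
    r'<meta\s+name="generator"\s+content="WordPress ([\d.]+)"',
    re.IGNORECASE,
)


def wordpress_version(html):
    """Return the WordPress version advertised in the generator tag, or None."""
    match = GENERATOR.search(html)
    return match.group(1) if match else None


def has_upload_form(html):
    """Return True if the source contains the MAX_FILE_SIZE form field."""
    return 'name="MAX_FILE_SIZE"' in html


# Hypothetical page source, made up for illustration.
page = (
    '<head><meta name="generator" content="WordPress 3.5" /></head>'
    '<form><input type="hidden" name="MAX_FILE_SIZE" value="100000"></form>'
)
print(wordpress_version(page), has_upload_form(page))
```

Neither string would be visible to a search engine that only indexes rendered text, which is the point of querying the markup.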

New flaws in web application security measures are constantly being researched, both by hackers and by security professionals. Most of these flaws affect all dynamic web applications whilst others are dependent on specific application technologies.

In both cases, one may observe how the evolution and refinement of web technologies also brings about new exploits which compromise sensitive databases, provide access to theoretically secure networks, and pose a threat to the daily operation of online businesses.

March 2016

Mixpanel Vs. Goliath

In a vast sea of analytics platforms, how many users choose Mixpanel over the competition?

It takes just 5 minutes to set up, and once you start watching the real-time data flood in, it’s clear that Mixpanel is not only the most “modern” and sleek analytics platform to date, but also provides a unique take on customer-oriented statistics and insight.

This isn’t a blog post about why Mixpanel is better. Instead, we want to show you some interesting statistics that exemplify the uphill battle Mixpanel faces in competing with the analytics juggernauts.

There is no denying that Google Analytics and Omniture dominate the online analytics industry. But just how big are they?  We researched the topic and found:

After seeing these numbers we thought, “Well, Mixpanel has a low adoption rate among all webmasters, but maybe their target market is larger web companies.” So we narrowed our search to just the top 1 million sites on the internet (based on traffic). Mixpanel appears on just 540 websites out of the top one million.

How hard would it be for Mixpanel to convince Google Analytics users to make the switch?

Upon further inspection we found 87% of domains that have Mixpanel code also use Google Analytics.
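
An overlap figure like this is a simple set calculation once you have the list of domains for each platform. The sketch below shows the arithmetic; the function name and the two small domain sets are made up for illustration and are not our data.

```python
def overlap_share(primary, other):
    """Return the fraction of domains in `primary` that also appear in `other`."""
    if not primary:
        return 0.0
    return len(primary & other) / len(primary)


# Hypothetical domain sets, made up for illustration.
mixpanel_domains = {"a.example", "b.example", "c.example", "d.example"}
google_analytics_domains = {"a.example", "b.example", "c.example", "z.example"}

share = overlap_share(mixpanel_domains, google_analytics_domains)
print(f"{share:.0%} of Mixpanel domains also use Google Analytics")
```

Note that the share is measured against the Mixpanel list, so it says nothing about what fraction of Google Analytics sites use Mixpanel.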

It’s tough to get out of Google’s shadow, so how will Mixpanel convince webmasters to pick them as their primary analytics platform?

June 2014

How Facebook Tricks Webmasters To Collect Users’ Web Surfing History

With the recent announcement that Facebook will begin selling your web browsing history to advertisers, we thought we’d take a look at how they actually get your web browsing history in the first place.

Most people assume that Facebook tracks them when on facebook.com, but you don’t have “Facebook” installed on your computer and you don’t “open up Facebook” to surf the web.  Where do they get data from?

Even without visiting facebook.com, plus.google.com, or twitter.com, you’re likely to encounter elements from these sites almost seven times a day. The trackers come in the shape of cookies, JavaScript, 1-pixel beacons, iframes, and cute-looking widgets.

These elements have the ability to ping Facebook’s servers with:

  • The URL of the page you’re viewing
  • The site that referred you to that page
  • The browser you’re using
  • The OS you’re using
  • Your approximate geographic location
  • The size of your screen
  • Your Facebook profile, if you’re logged into Facebook

The Facebook Like Button

One very popular widget on the internet is the Facebook Like button. It has made it easy for hundreds of millions of Web users to share content with their friends on the social networking site. The button appears on more than one-third of the top thousand websites and has been integrated into everything from Bing search results to countless blogs around the ‘net. What users may not realize is that the soft blue thumbs-up is tracking their surfing habits, even if it doesn’t get clicked.

Any time the Like button is displayed, information is zapped back to Facebook’s servers.

Facebook Connect and Your Privacy

Facebook Connect is the next iteration of the Facebook Platform that allows users to “connect” their Facebook identity, friends and privacy to any site. Even if you never log in to a site using Facebook Connect, the fact that the site has the Facebook Connect JavaScript snippet present means Facebook can see that you are on that site.

Over 50,000 sites use Facebook Connect, and if you’ve visited one of them, you’ve been tracked.

Like Boxes Are Creeping On You Too

The Like Box is a special version of the Like Button designed only for Facebook Pages. It allows admins to promote their Pages and embed a simple feed of content from a Page into other sites. As this is a JavaScript widget, every time it is loaded it pings information about you back to Facebook servers.

We found over 1 million websites that have this box (and that additionally show pictures of the followers’ faces).
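
Spotting these widgets in a page's source comes down to looking for the URLs they load from. The sketch below is one way to do it; the marker strings are assumptions about how the Like button, Like Box, and Facebook JavaScript are commonly embedded, and the sample markup is made up.

```python
# Assumed embed markers; real pages may load these widgets in other ways.
FACEBOOK_MARKERS = {
    "like_button": "facebook.com/plugins/like.php",
    "like_box": "facebook.com/plugins/likebox.php",
    "javascript_sdk": "connect.facebook.net",
}


def facebook_widgets(html):
    """Return the sorted names of Facebook widgets referenced in the page source."""
    lowered = html.lower()
    return sorted(name for name, marker in FACEBOOK_MARKERS.items() if marker in lowered)


# Hypothetical page source, made up for illustration.
page = (
    '<iframe src="//www.facebook.com/plugins/likebox.php?href=https%3A%2F%2Fwww.facebook.com%2Fexample"></iframe>'
    '<script src="//connect.facebook.net/en_US/all.js"></script>'
)
print(facebook_widgets(page))
```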

What Can You Do About It?

Twitter and Pinterest, which track people with their Tweet and PinIt buttons, offer users the ability to opt out. And Google has pledged it will not combine data from its ad-tracking network DoubleClick with personally identifiable data without users’ opt-in consent. Facebook does not offer an opt-out in its privacy settings.

Instead Facebook asks members to visit an ad industry page, where they can opt out from targeted advertising from Facebook and other companies. The company also says it will let people view and adjust the types of ads they see.

September 2013

How To Find New Clients For Your SEO Agency

NerdyData is a search engine for source code.  This post outlines some ways an SEO agency can use our tool to discover potential new clients, en masse.

It’s a gold rush out there for SEO agencies. As businesses come online in droves, they quickly discover that simply paying someone to develop a website will not bring the traffic they need to be profitable. Everyone wants to be at the top of a hot Google search. A criminal attorney in San Francisco who ranks for “criminal attorney in san francisco” will likely receive many contacts from people interested in legal representation.

Only a small percentage of websites show up in a top placement in organic search results for popular queries. Millions of websites exist but are not optimized in a way that will make them appear for these frequently searched keywords, and so they are displaced by those that are.

An SEO agency exists to bridge the gap between Google’s search algorithm and technologically unsavvy business owners.

We have come up with some ways an SEO agency can surface these poorly optimized sites using our search engine. Here are some examples: 


Search for sites that have “niche” and “location” in their <title> tag or on-page text, but DO NOT have a meta description tag

  • If you’re an SEO agency, you could use this type of search to narrow down sites owned by “criminal attorneys” in “san francisco” that most likely don’t have an SEO agency, because they lack a meta description tag on their web pages.

Additionally, we’ve made a number of tools that let you search within the <title> and Meta Descriptions of websites.
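
The first search above can be approximated on a single page with Python's standard HTML parser. This is only a sketch: the class and function names are made up, it checks the <title> tag only (not on-page text), and the sample pages are hypothetical.

```python
from html.parser import HTMLParser


class HeadScanner(HTMLParser):
    """Collect the <title> text and note whether a meta description exists."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.has_meta_description = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            content = (attr.get("content") or "").strip()
            # An empty description is treated the same as a missing one.
            if name == "description" and content:
                self.has_meta_description = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def is_seo_lead(html, niche, location):
    """True if the title mentions niche and location but no meta description exists."""
    scanner = HeadScanner()
    scanner.feed(html)
    title = scanner.title.lower()
    return (
        niche.lower() in title
        and location.lower() in title
        and not scanner.has_meta_description
    )


# Hypothetical pages, made up for illustration.
unoptimized = "<html><head><title>Criminal Attorney in San Francisco | Smith Law</title></head></html>"
optimized = (
    "<html><head><title>Criminal Attorney in San Francisco</title>"
    '<meta name="description" content="Defense lawyer serving the Bay Area."></head></html>'
)
print(is_seo_lead(unoptimized, "criminal attorney", "san francisco"))
print(is_seo_lead(optimized, "criminal attorney", "san francisco"))
```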


Search for sites that don’t have Facebook or Twitter badges, buttons, or social links on their pages.

  • There’s a good chance these sites do not have an online social presence.  Why don’t they?  These businesses could find new customers by creating a social media presence, but may not know how to create one.


Search for sites that use outdated or poorly optimized software

  • Many small business websites are using a version of a CMS, forum, or blog software that is not optimized for high volume queries in Google.  These sites are likely to already contain content, but are not designed in a way that allows them to capture search traffic for terms relevant to their business.

If you want to perform searches like these, try out NerdyData, a search engine that indexes the full source code of webpages and lets you query using code snippets as well as keywords.

Additionally, you can submit a request through this form and we can get in touch with you to help you uncover new business leads for your agency.

Or follow us on Twitter!
