Crawlomatic Multisite Scraper Post Generator Plugin For WordPress

JavaScript execution support for crawled pages!

What Can You Do With This Plugin?

Crawlomatic Multisite Scraper Post Generator Plugin for WordPress is a cutting-edge website crawling and scraping post generator and autoblogging plugin that uses website crawling and scraping to turn your website into an autoblogging or even a money-making machine!
Get content from almost any web page! You no longer need APIs, which require registration and provide limited access, and you can also retrieve data from websites that do not provide an API. Schedule it once and let it autopilot your posts 24/7 for you like a master!

How does it work?

This plugin will crawl the seed URL you give it (crawling means that it will search all links that the web page contains) and will visit and extract content from each crawled URL. The crawling process is customizable: you can set the crawling depth, the crawling rate and the maximum crawled article count, crawl only links with a specific class or ID, and apply many more customizations.
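
The crawling idea described above can be sketched in a few lines of Python. This is only an illustration, not the plugin's own code: the names crawl, get_links, max_depth and max_pages are hypothetical, and the link graph is an in-memory dictionary instead of real HTTP requests.

```python
from collections import deque

def crawl(seed, get_links, max_depth=2, max_pages=10):
    """Breadth-first crawl from a seed URL.

    get_links(url) is a hypothetical callback returning the links found on url.
    max_depth limits how far from the seed we follow links,
    max_pages limits the total number of collected URLs (seed included).
    """
    visited = [seed]           # URLs in the order they were discovered
    seen = {seed}              # fast duplicate check
    queue = deque([(seed, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue           # do not follow links beyond the depth limit
        for link in get_links(url):
            if link in seen:
                continue
            if len(visited) >= max_pages:
                return visited
            seen.add(link)
            visited.append(link)
            queue.append((link, depth + 1))
    return visited

# Tiny in-memory "website" used instead of real network access (assumption).
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a1", "/"],
    "/b": ["/b1"],
    "/a1": ["/deep"],
}

def demo_links(url):
    return SITE.get(url, [])

print(crawl("/", demo_links, max_depth=1))
print(crawl("/", demo_links, max_depth=2))
```

With a depth of 1 only the links on the seed page are collected; raising the depth also follows the links found on those pages.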

Crawlomatic v2.0 update

In the v2.0 update, a new live scraper shortcode was added to the plugin: [crawlomatic-scraper]. This new feature makes this plugin an easy-to-implement web data extractor for WordPress. As a result, it can be used to display real-time data from any website directly in your posts, pages or sidebar. It also temporarily caches the scraped content, so your website will not overuse resources. You can use this plugin to include real-time stock quotes, cricket or soccer scores or any other generic content from public domains!

New features included in this update:

Scraped output can be displayed via a custom template tag or a shortcode in pages, posts and the sidebar (via a text widget).
Configurable caching of scraped data. The cache timeout can be defined in minutes for every scrape.
Configurable user agent for your scraper, which can be set for every scrape.
Configurable default settings such as enabling, user agent, timeout, caching and error handling.
Multiple ways to query content – CSS Selector, XPath, Regex or Auto Detection.
A wide range of arguments for parsing content.
Option to pass POST arguments to a URL to be scraped.
Dynamic conversion of scraped content to a specified character encoding, to scrape data from a website that uses a different charset.
Create scraped pages on the fly using dynamic generation of URLs to scrape or POST arguments based on your page’s GET or POST arguments.
Callback function for advanced parsing of scraped data.
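
To illustrate the per-scrape cache timeout from the list above, here is a minimal Python sketch of a time-based cache. It is not the plugin's implementation: the class ScrapeCache, the method get_or_fetch and the fetch callback are hypothetical names, and the fetch function in the demo returns a fixed string instead of downloading a page.

```python
import time

class ScrapeCache:
    """Keeps scraped content per URL until a timeout (in minutes) expires."""

    def __init__(self, clock=time.time):
        self._clock = clock    # injectable clock, handy for testing
        self._store = {}       # url -> (expiry timestamp, content)

    def get_or_fetch(self, url, fetch, timeout_minutes):
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now < entry[0]:
            return entry[1]    # still fresh: serve the cached copy
        content = fetch(url)   # expired or missing: scrape again
        self._store[url] = (now + timeout_minutes * 60, content)
        return content

# Demo with a stand-in fetch function (assumption: no real network access).
def fake_scrape(url):
    return "scraped content of " + url

cache = ScrapeCache()
print(cache.get_or_fetch("https://example.com/scores", fake_scrape, 5))
print(cache.get_or_fetch("https://example.com/scores", fake_scrape, 5))  # served from cache
```

A longer timeout reduces load on both your server and the source site, while a shorter one keeps fast-changing data such as scores closer to real time.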

Check the official documentation of the v2 update, browse the examples and check the FAQ for crafting a perfectly optimized web scraper.

More about the plugin

You can scrape content from almost every page that you can open in your browser. If the content is loaded using JavaScript, the plugin can be combined with PhantomJS to also scrape JavaScript-generated content.

Also, you can create an unlimited number of custom website crawling and scraping rules.

Other plugin features:

v2.5.5 update: Automatically update scraped posts/pages/products if the source website changes + unpublish (set as draft) the post/page/product if the scraped URL is no longer available on the source website (optional features, can be enabled/disabled)
v2.5.1 update: Scrape WooCommerce product variants from other WooCommerce/Shopify stores
v2.5.0 update: Scrape search engine results for your custom keyword searches, from Google or from Bing. Check the tutorial video of this new feature.
v2.4.1 update: Scrape product image galleries for WooCommerce products (for non-product post types, post attachments will be created from the scraped images)
v2.3.5 update: Execute your own JavaScript code on the scraped HTML and scrape the results – this feature is available only when headless browsers (Puppeteer/Tor/PhantomJS) or HeadlessBrowserAPI are used for scraping
v2.2.1 update: Crawl RSS feeds for links and scrape articles listed in them
v2.2.0 update: Use HeadlessBrowserAPI to scrape JavaScript-generated HTML content from any website on the internet without the need to install anything (besides this plugin) on your server – tutorial video
v2.1.0 update: Scrape .onion websites from the Dark Web using the Tor Browser and Puppeteer! – tutorial video
v2.0.0 update: Live Scraper shortcode added for even more crawling control and scraping power: [crawlomatic-scraper]
v1.7.1 update: Sitemap crawling supported – video tutorial
v1.6.5 update: Visual content selector support added – video tutorial
v1.6.0 update: Added the ability to make screenshots of crawled pages and use them in the generated post’s content – video tutorial
v1.5.2 update: Ability to shorten outgoing (post source) links (and monetize them), using the Shorte.st link shortener service – example of shortened link
v1.4.8 update: Added JavaScript execution support for crawled pages – requires PhantomJS installed on the server – How to install PhantomJS? – video tutorial
v1.4.4 update: Added the ability to set multiple proxies for crawling pages. The plugin will select one at random at each page access
v1.4.0 update: Added the ability to paginate crawling (crawling for articles will continue on the next page of the seed page).
v1.4.0 update: Added the ability to import product prices for crawled products (WooCommerce compatible) + dropshipping price automatic modification – video tutorial
v1.4.0 update: Added the ability to increase the imported product price by a fixed amount or to multiply it by a predefined amount (great value for dropshipping!)
v1.2.8 update: Added paginated post importing support (into a single crawled post). Check: VIDEO.
v1.2.4 update: Added the ability to set proxies for crawling pages
v1.2.3 update: Added an option to crawl the page from Google cache when direct crawling fails (blocked)
Google Translate support – select the language in which you want to post your articles
Text Spinner support – automatically modify generated text, changing words with their synonyms – built-in, The Best Spinner, SpinRewriter, WordAI, TurkceSpin and others – great SEO value!
customizable generated post status (published, draft, pending, private, trash)
shortcode to list all posts generated by this plugin: [crawlomatic-list-posts type => ‘any’, order => ‘ASC’, ‘orderby’ => ‘date’, ‘posts’ => 50, ‘category’ => ’’, ‘ruleid’ => ’’]
crawling and scraping can be set to respect the robots.txt files of websites and robots HTML headers of scraped pages
automatically generate post categories or tags from marketplace items
manually add post categories or tags to items
choose if you want to update a post if it is already posted
send custom cookies with the request to the crawled web page (authentication)
generate post or page or any custom post type
embeds videos from YouTube, Vimeo, Flickr, IGN, Ustream.tv and DailyMotion using website crawling and scraping
define publishing constraints: don’t publish posts that do not have images, or posts with a short/long title/content
automatically generate a featured image for the post
enable/disable comments, pingbacks or trackbacks for the generated post
customize post title and content (with the included wide variety of relevant post shortcodes)
‘Keyword Replacer Tool’ – Its purpose is to define keywords that are substituted automatically with your affiliate links, anywhere they appear in the content of your site. For example, you can define a keyword ‘codecanyon’ and have it substituted by a link to http://www.codecanyon.net/?ref=user_name anywhere it appears in your site’s content.
‘Random Sentence Generator Tool’ (relevant sentences – as you define them)
option to automatically delete generated posts after a period of time
detailed plugin activity logging
scheduled rule runs
custom field support for generated posts
custom taxonomies support for generated posts
unlimited crawled variable importing (unlimited imported parts of the crawled pages)
option to copy images locally or not
ability to parse JSON data using Regex
option to add canonical meta tag to generated posts
Maximum/minimum title length post limitation
Maximum/minimum content length post limitation
Add post only if predefined required keywords are found in the title/content
Add post only if predefined banned keywords are not found in the title/content
Save and restore plugin rule list from file
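
As an illustration of the ‘Keyword Replacer Tool’ idea from the list above, the following Python sketch swaps whole-word keywords for links. It is a simplified assumption of how such a replacement could work, not the plugin's code: the function name replace_keywords is hypothetical, and the sketch treats the input as plain text (it does not skip keywords that already sit inside HTML tags).

```python
import re
from html import escape

def replace_keywords(text, links):
    """Replace whole-word keywords (case-insensitive) with HTML links.

    links maps a keyword to the URL it should point to.
    """
    if not links:
        return text
    # Longest keywords first, so longer phrases win over shorter ones.
    keywords = sorted(links, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): url for k, url in links.items()}

    def to_link(match):
        word = match.group(0)  # keep the original capitalization
        url = lookup[word.lower()]
        return '<a href="{}">{}</a>'.format(escape(url, quote=True), word)

    return pattern.sub(to_link, text)

print(replace_keywords(
    "Buy it on CodeCanyon today",
    {"codecanyon": "http://www.codecanyon.net/?ref=user_name"},
))
```

The word-boundary match keeps the keyword from being replaced inside longer words.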

Testing this plugin

You can test the plugin’s functionality using the ‘Test Site Generator’. Here you can try the plugin’s full functionality. Note that the generated testing blog will be deleted automatically after 24 hours.

Plugin Requirements

PHP DOM -> how to install it (in case you don’t have it, but you probably already do): http://php.net/manual/en/dom.setup.php
PHP 5.0 or higher
dom, mbstring, iconv and json extensions (enabled by default)

For more info on how to configure the plugin, please also check this 1 hour long tutorial video, which covers the full feature set of the plugin.

Need help?

Please check our knowledge base; it may have the answer to your question or a solution for your issue. If not, just email me at support@coderevolution.ro and I’ll reply as soon as I can.

Changelog:

Version 1.0 Release Date 2017-08-15

First version released!

Version 1.1 Release Date 2017-08-16

Fixed some small issues

Version 1.2 Release Date 2017-08-17

Added the ability to crawl page by div class or id

Version 1.2.1 Release Date 2017-08-18

Fixed incompatibility with some WordPress installs

Version 1.2.2 Release Date 2017-08-22

Added a shortcode to display posts generated by this plugin

Version 1.2.3 Release Date 2017-08-30

Added an option to crawl the page from Google cache when direct crawling fails (blocked)

Version 1.2.4 Release Date 2017-08-31

Added the ability to set proxies for crawling pages

Version 1.2.5 Release Date 2017-09-04

Added canonicalization for generated articles

Version 1.2.6 Release Date 2017-09-13

Made the plugin timezone aware

Version 1.2.7 Release Date 2017-09-14

Fixed post date for non-GMT blogs

Version 1.2.8 Release Date 2017-09-23

Added paginated post importing support

Version 1.2.9 Release Date 2017-09-27

Bugfixes

Version 1.3.0 Release Date 2017-09-28

Fixed rule restore

Version 1.3.1 Release Date 2017-10-20

Fixed featured image generation

Version 1.3.2 Release Date 2017-10-22

Added crawling helper

Version 1.3.3 Release Date 2017-11-06

Fixed a memory issue

Version 1.3.4 Release Date 2017-11-07

Bugfixes

Version 1.3.5 Release Date 2017-12-14

Fixed class selector not working in all cases

Version 1.3.6 Release Date 2017-12-18

Added the ability to specify a custom user agent for each crawled web page

Version 1.3.7 Release Date 2018-01-20

Added a new text spinner service: SpinRewriter

Version 1.3.8 Release Date 2018-01-22

Plugin can now continuously import content

Version 1.3.9 Release Date 2018-02-02

Fixed issue when multiple crawl classes were specified

Version 1.4.0 Release Date 2018-02-22

Major update: added the ability to crawl imported product prices (WooCommerce compatible)
Added the ability to crawl serial content (paged crawling - crawling for articles will continue on the next page)

Version 1.4.1 Release Date 2018-03-07

Bugfixes

Version 1.4.2 Release Date 2018-03-21

Fixed a duplicate posting issue

Version 1.4.3 Release Date 2018-03-22

Fixed a critical issue with multiple rule running

Version 1.4.4 Release Date 2018-04-04

Added the ability to define multiple proxies. The plugin will select one at random at each page access

Version 1.4.5 Release Date 2018-07-13

Updated built-in readability module

Version 1.4.6 Release Date 2018-07-16

Critical bugfixes

Version 1.4.7 Release Date 2018-07-19

Added the ability to not translate links

Version 1.4.8 Release Date 2018-09-05

Added JavaScript execution support for crawled pages - requires PhantomJS installed on server

Version 1.4.9 Release Date 2018-09-18

Bugfixes

Version 1.5.0 Release Date 2018-09-24

Added the ability to add custom post taxonomies from crawled content
Added the ability to add unlimited crawled variables to a post's content/meta/taxonomies

Version 1.5.1 Release Date 2018-10-16

Fixed issue when importing large pages

Version 1.5.2 Release Date 2018-10-24

Added the ability to shorten links using Shorte.st

Version 1.5.3 Release Date 2018-10-29

Fixed issue when importing paginated posts

Version 1.5.4 Release Date 2018-11-06

Added the ability to strip HTML elements by tag name (div, a, span, etc.)

Version 1.5.5 Release Date 2018-11-07

Added WooCommerce product category creation support

Version 1.5.6 Release Date 2018-12-16

Added nested importing support - import mixed content into a single post, from multiple plugins created by CodeRevolution

Version 1.5.7 Release Date 2018-12-16

Added the ability to define a list of URLs to skip from crawling and importing

Version 1.5.8 Release Date 2019-01-08

Added the ability to import royalty free images for created posts

Version 1.5.9 Release Date 2019-01-12

Added Gutenberg blocks support

Version 1.6.0 Release Date 2019-02-01

Added the ability to make screenshots of scraped pages

Version 1.6.1 Release Date 2019-02-06

Improved compatibility with some crawled pages

Version 1.6.2 Release Date 2019-04-19

Security update

Version 1.6.3 Release Date 2019-05-15

Fixed some recently found bugs with post pagination

Version 1.6.4 Release Date 2019-05-17

Added support for TurkceSpin content spinner

Version 1.6.5 Release Date 2019-05-27

Added a much requested new feature: Visual Content Selector for assigning scraped page content
Added the ability to scrape pages from bottom to top
Added the ability to replace words in scraped content
Other minor bug fixes and performance improvements

Version 1.6.6 Release Date 2019-07-26

Fixed timeout issue with some crawled pages
Many small issues fixed and features improved

Version 1.6.7 Release Date 2019-08-05

Fixed issue with Google Translate

Version 1.6.8 Release Date 2019-11-15

WordPress 5.3 compatibility update

Version 1.6.9 Release Date 2020-05-11

New features added for content templates
Bugfix update

Version 1.7.0 Release Date 2020-07-21

Added support for scraping more sites

Version 1.7.1 Release Date 2020-09-28

Added the ability to crawl sitemaps and to scrape posts linked in them
Added the ability to respect the directives set in the robots.txt files
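
For readers unfamiliar with robots.txt directives, the check that a crawler performs before visiting a URL can be sketched with Python's standard urllib.robotparser module. This only illustrates the general technique, not how the plugin implements it; the user agent name DemoBot and the example rules are made up for the demo.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content (made up); a real crawler would download it
# from the site's /robots.txt location.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(url, user_agent="DemoBot"):
    """Return True if the parsed robots.txt rules permit fetching url."""
    return parser.can_fetch(user_agent, url)

print(allowed("https://example.com/blog/post-1"))
print(allowed("https://example.com/private/report"))
```

A crawler that respects these rules skips any URL for which the check returns False.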

Version 2.0.0 Release Date 2020-12-08

Added a new shortcode and Gutenberg block alternative that will enable live scraping of any website
Major performance improvement
Fixed reported bugs

Version 2.1.0 Release Date 2021-01-02

Added support for using the Tor Browser to crawl dark web pages! Scrape .onion websites like you would scrape any other public website!

Version 2.1.1 Release Date 2021-01-04

Added the ability to crawl and scrape pages using POST requests (POST form submission scraping support)

Version 2.2.0 Release Date 2021-01-14

Added support for HeadlessBrowserAPI to scrape JavaScript rendered content with ease

Version 2.2.1 Release Date 2021-01-16

PHP 8 compatibility update
Added support for crawling links from RSS feeds

Version 2.2.2 Release Date 2021-01-28

Fixed rare issue when saving importing rule settings on some PHP 8 configurations

Version 2.2.3 Release Date 2021-02-01

Improved content extraction algorithm

Version 2.2.4 Release Date 2021-02-17

Added the ability to not spin posts generated by specific rules

Version 2.2.5 Release Date 2021-03-07

Added the ability to enter multiple URLs (one per line) to be crawled and scraped

Version 2.2.6 Release Date 2021-03-07

Visual Selector improvements - it is now able to use HeadlessBrowserAPI/Puppeteer/PhantomJS/Tor to visualize scraped content

Version 2.2.7 Release Date 2021-04-02

Fixed rare issues when crawling links with URL parameters

Version 2.2.8 Release Date 2021-04-07

Fixed rare issues with relative URL paths in crawled content

Version 2.2.9 Release Date 2021-05-03

Added the ability to skip publishing of new posts if no images are found (separately, for each rule)

Version 2.3.0 Release Date 2021-05-19

Added the ability to make screenshots of websites using the HeadlessBrowserAPI feature

Version 2.3.1 Release Date 2021-06-10

Fixed content extracting/stripping in case of some websites with dynamically generated content

Version 2.3.2 Release Date 2021-07-15

Added multiple Regex expression support (for content stripping and replacement)

Version 2.3.3 Release Date 2021-07-18

Added SpinnerChief to the supported premium text spinners (SpinRewriter, The Best Spinner, WordAI, TurkceSpin)

Version 2.3.4 Release Date 2021-07-19

Added Bing Translator support (next to Google Translator and DeepL Translator)

Version 2.3.5 Release Date 2021-08-06

Added the ability to execute your own custom JavaScript on scraped pages when using headless browsers (PhantomJS/Puppeteer/Tor) or HeadlessBrowserAPI (XSS - cross-site scripting feature) and scrape the resulting HTML content

Version 2.3.6 Release Date 2021-08-30

Added the ability to set featured images of posts from website screenshots
Added the ability to remove HTML content (leave text only) of XPath matched content

Version 2.3.7 Release Date 2021-09-02

Added the ability to set local storage objects when scraping websites (these are similar to cookies; their usage is supported only when using headless browsers or HeadlessBrowserAPI in conjunction with the plugin)

Version 2.3.8 Release Date 2021-09-15

Added the ability to set the WPML language to created posts

Version 2.3.9 Release Date 2021-10-19

WooCommerce product scraping related improvements

Version 2.4.0 Release Date 2022-02-28

Added support for creating WooCommerce product attributes and assigning values to them from scraped data

Version 2.4.1 Release Date 2022-03-05

Added the ability to scrape image galleries for WooCommerce products

Version 2.4.1.1 Release Date 2022-03-21

Bugfix update

Version 2.4.2 Release Date 2022-04-20

Fixed Google Translator problem caused by a recent Google API update

Version 2.5.0 Release Date 2022-05-01

Crawlomatic can now scrape search engine results from Google and Bing - tutorial video: https://www.youtube.com/watch?v=h6fQeH9-X8c

Version 2.5.1 Release Date 2022-05-06

Added the ability to scrape WooCommerce product variations from Shopify and other WooCommerce stores
Added the ability to automatically detect product prices
Improved readability module
Fixes and improvements

Version 2.5.2 Release Date 2022-06-14

Added the ability to translate posts a third time (acting like a Word Spinner, if the content is translated back to the original language)

Version 2.5.3 Release Date 2022-06-23

Fixed WooCommerce price scraping related issue

Version 2.5.4 Release Date 2022-09-12

Added the ability to scrape links from TXT files

Version 2.5.5 Release Date 2022-10-14

Major update: post/page/product automatic updating if the scraped source URL changed

Version 2.5.6 Release Date 2022-11-30

Major update: added support for Google News scraping

Version 2.5.7 Release Date 2023-01-05

Added a new ability to HeadlessBrowserAPI to click on HTML elements by CSS selectors, enabling loading of Ajax content and bypassing Captchas which require a click

Version 2.5.8 Release Date 2023-01-17

Added product regular price scraping feature to WooCommerce products - the regular price is the price displayed before the discount is applied. You can scrape this full price from the websites or add to/multiply the original price to create it automatically
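
The add/multiply option described above amounts to simple price arithmetic, sketched here in Python. The function name adjust_price and its mode parameter are hypothetical, and rounding to two decimals is an assumption made for the example, not documented plugin behavior.

```python
from decimal import Decimal, ROUND_HALF_UP

def adjust_price(price, mode, amount):
    """Derive a new price from a scraped one.

    mode "add" increases the price by a fixed amount,
    mode "multiply" multiplies it by a factor.
    Decimal is used to avoid binary floating point rounding surprises.
    """
    base = Decimal(price)
    value = Decimal(amount)
    if mode == "add":
        result = base + value
    elif mode == "multiply":
        result = base * value
    else:
        raise ValueError("mode must be 'add' or 'multiply'")
    # Round to cents (assumption: two-decimal currency).
    return result.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(adjust_price("19.99", "add", "5"))         # 24.99
print(adjust_price("19.99", "multiply", "1.5"))  # 29.99
```

The same arithmetic covers both the dropshipping markup on imported prices and a regular price derived from the scraped sale price.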

Version 2.5.9 Release Date 2023-02-10

Fixed Google News scraping after recent changes

Version 2.6.0 Release Date 2023-03-13

Added more DeepL languages
Multiline scraping expressions support added
Fixed all reported issues

Version 2.6.0.1 Release Date 2023-04-13

Fixed reported bugs

Version 2.6.0.2 Release Date 2023-05-10

Improved scraper auto detection

Version 2.6.0.3 Release Date 2023-05-22

Fixed more reported bugs

Version 2.6.0.4 Release Date 2023-06-13

Reworked backend, improved scraping speed

Version 2.6.0.5 Release Date 2023-06-29

Scraped content now better matches source website styling

Version 2.6.0.6 Release Date 2023-07-28

Fixed Google Translate integration, working with latest changes

Version 2.6.0.7 Release Date 2023-10-18

Fixed PHP 8.2 related errors

Version 2.6.1 Release Date 2024-02-15

Fixed an issue with rule saving

Version 2.6.2 Release Date 2024-03-15

Visual selector fix for CSS issue happening in some cases

Version 2.6.3 Release Date 2024-07-12

Bugfix release
Purchase code verification now required for the plugin to function

Version 2.6.4 Release Date 2024-10-26

Content filtering improvements

Version 2.6.5 Release Date 2024-10-31

Added support for automatic Magento product variation scraping

Are you already a customer?

If you already bought this and you have tried it out, please contact me in the item’s comment section and give me feedback, so I can make it a better WordPress plugin!

WordPress 6.7 and PHP 8.4 Tested!

Disclaimer
Through this plugin you are able to capture content from various websites that do not necessarily belong to you or which are not under your control. If you capture copyrighted material without the author’s permission, the plugin’s developer does not assume any responsibility for your actions. Also, the plugin’s developer has no control over the nature, content and availability of those sites.

Do you like our work and want more of it?
Check out this MEGA plugin bundle.