Content
The common keys required for all sources are:
- "name" - String - The name of the source presented in weyd's links page
- "language" - Array of String - Not currently processed by weyd; it is included because other scraper packages use it. It will be used in the future, so it is wise to add it now. Once implemented, weyd will use ISO standard language codes.
- "domains" - Array of String - Not currently processed by weyd. In the future, each entry will be substituted for the domain name in "base_url" so that backup domains can be searched when the primary domain is down. Each entry should be a bare FQDN, without https:// and without a port number.
- "base_url" - String - The base URL that is prepended to search requests. This must be HTTPS and must contain an FQDN for a server with a valid SSL certificate issued by a trusted CA.
- "search_url_format_episode" - JSON Object - This must be present even if the source being scraped does not contain TV shows.
  - "string_format" - String - Contains a Java String.format() template for String replacement. If the source being scraped does not contain TV shows, or you don't want weyd to search for TV shows, leave this as an empty string.
  - "replacement" - Array of String - This must contain the same number of items as "string_format" has format specifiers. The items must be listed in the order in which they are substituted, and each must be the correct data type for its specifier (an Integer value for %d, a String value for %s).
    - Possible values:
      - "title" - Replaced by "The Title Exactly How It Is" - Spaces are replaced with + for URL encoding
      - "title_lower" - Replaced by "the title lowercase with spaces" - Spaces are replaced with + for URL encoding
      - "title_lower_dash" - Replaced by "the-title-lowercase-with-dashes-for-spaces"
      - "year_text" - Replaced by the String version of year - "2001"
      - "year_int" - Replaced by the Integer version of year - 2015
      - "season_text" - Replaced by the String version of season - "1", "15"
      - "season_int" - Replaced by the Integer version of season - 1, 5, 20
      - "episode_text" - Replaced by the String version of episode - "3", "12"
      - "episode_int" - Replaced by the Integer version of episode - 2, 4, 8
- "search_url_format_movie" - JSON Object - This must be present even if the source being scraped does not contain Movies.
  - "string_format" - String - Contains a Java String.format() template for String replacement. If the source being scraped does not contain Movies, or you don't want weyd to search for Movies, leave this as an empty string.
  - "replacement" - Array of String - This must contain the same number of items as "string_format" has format specifiers. The items must be listed in the order in which they are substituted, and each must be the correct data type for its specifier (an Integer value for %d, a String value for %s).
    - Possible values:
      - "title" - Replaced by "The Title Exactly How It Is" - Spaces are replaced with + for URL encoding
      - "title_lower" - Replaced by "the title lowercase with spaces" - Spaces are replaced with + for URL encoding
      - "title_lower_dash" - Replaced by "the-title-lowercase-with-dashes-for-spaces"
      - "year_text" - Replaced by the String version of year - "2001"
      - "year_int" - Replaced by the Integer version of year - 2015
- "search_url_format_season_pack" - JSON Object (optional - only available if "is_torrent": true) - The presence of this JSON Object indicates that this source can be searched for Series Packs. Be careful with this option because it slows down searching with your script. It's best to limit this to API sources and HTML sources that have "links_on_first_page": true.
  - "string_format" - String - Contains a Java String.format() template for String replacement. Use only 1 %s in the position where weyd will insert various combinations of the title for searching Series Packs. Do not include any %d for year, season, or episode.
  - "replacement" - Array of String - This must contain one item for each format specifier in "string_format" other than the last %s. The items must be listed in the order in which they are substituted, and each must be the correct data type for its specifier. The last %s is filled in by weyd, which inserts various combinations of the title for searching Series Packs. Do not include any year, season, or episode replacements.
    - Possible values:
      - "title" - Replaced by "The Title Exactly How It Is" - Spaces are replaced with + for URL encoding
      - "title_lower" - Replaced by "the title lowercase with spaces" - Spaces are replaced with + for URL encoding
      - "title_lower_dash" - Replaced by "the-title-lowercase-with-dashes-for-spaces"
- "is_torrent" - Boolean - Whether this source contains torrents. If this is true, "is_direct" is ignored.
- "is_direct" - Boolean - Whether this source contains direct downloads.
- "name_delete_filter" - Array of String - Each String in this array is removed from the scraped title.
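As an illustration of "name_delete_filter", here is a minimal Java sketch of removing each listed String from a scraped title. It is not weyd's actual code: the class and method names are hypothetical, and literal, case-sensitive matching plus the final trim are assumptions of this sketch.

```java
import java.util.List;

class NameDeleteFilterSketch {
    // Removes every filter String from the scraped title.
    // Assumption: literal, case-sensitive matching; weyd's real rules may differ.
    static String applyFilters(String scrapedTitle, List<String> nameDeleteFilter) {
        String result = scrapedTitle;
        for (String unwanted : nameDeleteFilter) {
            result = result.replace(unwanted, "");
        }
        // Trimming is a convenience of this sketch, not documented behavior.
        return result.trim();
    }

    public static void main(String[] args) {
        // Both filter entries are made-up examples.
        List<String> filters = List.of("[example-site]", "www.example.com -");
        System.out.println(applyFilters("[example-site] Some Show S01E03 1080p", filters));
    }
}
```

With the made-up filters above, the site tag is stripped and the rest of the title is left untouched.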
Example
{
  "name": "weyd",
  "language": ["en"],
  "domains": ["weyd.app"],
  "base_url": "https://weyd.app",
  "search_url_format_episode": {
    "string_format": "/search?query=%s+s%02de%02d&type=tv",
    "replacement": [
      "title_lower",
      "season_int",
      "episode_int"
    ]
  },
  "search_url_format_movie": {
    "string_format": "/search?query=%s+%04d&type=movie",
    "replacement": [
      "title_lower",
      "year_int"
    ]
  },
  "search_url_format_season_pack": {
    "string_format": "/%.1s/%s",
    "replacement": [
      "title_lower"
    ]
  },
  "is_torrent": true,
  "is_direct": false
}
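To make the templates concrete, the following Java sketch shows how "string_format" and "replacement" could combine into a search URL, using templates modeled on the example above. It is only an approximation of what weyd does internally: the class, the helper methods, the title "Some Show", and the season pack search term are all hypothetical. Only String.format() itself is standard Java.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class SearchUrlSketch {
    // Hypothetical helper: derives every replacement value for one title.
    static Map<String, Object> valuesFor(String title, int year, int season, int episode) {
        String lower = title.toLowerCase(Locale.ROOT);
        return Map.<String, Object>of(
            "title", title.replace(' ', '+'),
            "title_lower", lower.replace(' ', '+'),
            "title_lower_dash", lower.replace(' ', '-'),
            "year_text", String.valueOf(year),
            "year_int", year,
            "season_text", String.valueOf(season),
            "season_int", season,
            "episode_text", String.valueOf(episode),
            "episode_int", episode);
    }

    // Looks up each replacement key and passes the values, in order, to String.format().
    // extraArgs are appended after the replacement values (used for the season pack case).
    static String buildUrl(String baseUrl, String stringFormat, List<String> replacement,
                           Map<String, Object> values, Object... extraArgs) {
        List<Object> args = new ArrayList<>();
        for (String key : replacement) {
            args.add(values.get(key));
        }
        Collections.addAll(args, extraArgs);
        return baseUrl + String.format(stringFormat, args.toArray());
    }

    public static void main(String[] args) {
        Map<String, Object> v = valuesFor("Some Show", 2015, 1, 3);
        String base = "https://weyd.app";

        // Episode: %s takes the title, each %02d takes an Integer padded to two digits.
        System.out.println(buildUrl(base, "/search?query=%s+s%02de%02d&type=tv",
            List.of("title_lower", "season_int", "episode_int"), v));

        // Movie: %04d takes the Integer year.
        System.out.println(buildUrl(base, "/search?query=%s+%04d&type=movie",
            List.of("title_lower", "year_int"), v));

        // Season pack: %.1s keeps only the first character of title_lower. The final %s
        // is left for weyd; "some+show+s01" is a made-up stand-in for that value.
        System.out.println(buildUrl(base, "/%.1s/%s", List.of("title_lower"), v, "some+show+s01"));

        // Pairing %d with a "_text" value fails: Java will not format a String as an integer.
        try {
            buildUrl(base, "/search?query=%s+%04d", List.of("title_lower", "year_text"), v);
        } catch (java.util.IllegalFormatConversionException e) {
            System.out.println("type mismatch: " + e.getMessage());
        }
    }
}
```

The last case is why the "_int" and "_text" variants both exist: a %d specifier needs an Integer value, while %s accepts a String.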
Source types
There are two types of sources, and each has a different layout. A source can be either an API that returns JSON or an HTML-based website.
Regardless of which type you're trying to access, all URLs must be HTTPS to an FQDN with a valid SSL certificate. Raw IP addresses cannot be used; weyd will not be able to access the links.
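A quick pre-flight check for these URL rules might look like the Java sketch below. It is a heuristic written for this page, not part of weyd: the class and method names are hypothetical, the "host must contain a dot" test is an assumption about what counts as an FQDN, and the certificate itself is not checked.

```java
import java.net.URI;
import java.net.URISyntaxException;

class BaseUrlCheckSketch {
    // Returns true when the URL uses https and its host looks like a domain name
    // rather than a raw IPv4 or IPv6 address. Does NOT validate the SSL certificate.
    static boolean looksAcceptable(String url) {
        URI uri;
        try {
            uri = new URI(url);
        } catch (URISyntaxException e) {
            return false;
        }
        String host = uri.getHost();
        if (host == null || !"https".equalsIgnoreCase(uri.getScheme())) {
            return false;
        }
        boolean ipv6Literal = host.startsWith("[");                      // e.g. [::1]
        boolean ipv4Literal = host.matches("\\d{1,3}(\\.\\d{1,3}){3}");  // e.g. 192.168.1.10
        boolean hasDot = host.contains(".");                             // assumption: an FQDN has a dot
        return !ipv6Literal && !ipv4Literal && hasDot;
    }

    public static void main(String[] args) {
        System.out.println(looksAcceptable("https://weyd.app"));             // true
        System.out.println(looksAcceptable("http://weyd.app"));              // false: not HTTPS
        System.out.println(looksAcceptable("https://192.168.1.10/search"));  // false: raw IP
    }
}
```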
To mark a source as an API, you must include a JSON Object with the key "api":
{
  "api": {
    "key": "value",
    "key": "value",
    "key": "value"
  }
}
If this key ("api") exists, all other directives for finding values on the page are ignored.
Without the "api" key, the source is handled as an HTML-based website by default.
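The rule can be summarized in a few lines of Java. This is a sketch only: the class, the enum, and the idea of holding the parsed source as a Map are assumptions made for illustration, not weyd's actual types.

```java
import java.util.HashMap;
import java.util.Map;

class SourceTypeSketch {
    enum SourceType { API, HTML }

    // A source is treated as an API when the "api" key is present; otherwise as HTML.
    static SourceType detect(Map<String, Object> source) {
        return source.containsKey("api") ? SourceType.API : SourceType.HTML;
    }

    public static void main(String[] args) {
        Map<String, Object> htmlSource = new HashMap<>();
        htmlSource.put("name", "weyd");

        Map<String, Object> apiSource = new HashMap<>(htmlSource);
        apiSource.put("api", new HashMap<String, Object>());

        System.out.println(detect(htmlSource)); // HTML
        System.out.println(detect(apiSource));  // API
    }
}
```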