Here is how I've implemented an LD+JSON parser in my local project:
(html: string): $ReadOnlyArray<Object> => {
  const dom = new JSDOM(html);
  const nodes = Object.values(dom.window.document.querySelectorAll('script[type="application/ld+json"]'));

  return nodes.map((node) => {
    if (!node || typeof node.innerHTML !== 'string') {
      throw new TypeError('Unexpected content.');
    }

    let body = node.innerHTML;

    debug('body', body);

    // Some websites (e.g. Empire) have JSON that includes new-lines, i.e. invalid JSON.
    body = body.replace(/\n/g, '');

    // Some websites (e.g. Variety) have JSON that is surrounded in CDATA comments, e.g.
    // https://gist.github.com/gajus/4a2653b4a5235ccebedc44467a2896f2
    body = body.slice(body.indexOf('{'), body.lastIndexOf('}') + 1);

    return JSON.parse(body);
  });
};
Thus far it works with all the sites I have been testing.
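To make the clean-up steps easier to try in isolation, here is a minimal sketch that applies the same two transformations to a plain string, without JSDOM. The function name `parseLooseJsonLd` and the sample inputs are hypothetical and only serve as illustration; the explicit "no object found" check is an addition that is not in my snippet above.

```javascript
// Hypothetical helper: applies the same clean-up as the parser above to one script body.
function parseLooseJsonLd(body) {
  if (typeof body !== 'string') {
    throw new TypeError('Unexpected content.');
  }

  // Drop raw newlines, which JSON.parse rejects inside string values.
  const flattened = body.replace(/\n/g, '');

  // Keep only the span from the first "{" to the last "}", which discards
  // CDATA comment wrappers and anything trailing the object, such as ";".
  const start = flattened.indexOf('{');
  const end = flattened.lastIndexOf('}');

  // Added guard (assumption, not in the original snippet).
  if (start === -1 || end < start) {
    throw new SyntaxError('No JSON object found.');
  }

  return JSON.parse(flattened.slice(start, end + 1));
}

// Made-up inputs resembling the problem cases.
const wrapped = '//<![CDATA[\n{"@type": "Movie", "name": "Example\n Title"}\n//]]>';
const trailing = '{"@type": "Person"};';

console.log(parseLooseJsonLd(wrapped).name);
console.log(parseLooseJsonLd(trailing)['@type']);
```

One limitation of slicing between the first `{` and the last `}`: a script whose top-level value is an array of objects would lose its brackets and fail to parse, so that case would need separate handling.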
web-auto-extractor/src/parsers/jsonld-parser.js
Lines 8 to 22 in 2d15ce4
The current JSON-LD parser assumes a perfect world scenario.
`;` at the end of the JSON.