Options for the Readability library. All options are optional.

- `debug` (boolean, default `false`): whether to enable logging.
- `maxElemsToParse` (number, default `0`, i.e. no limit): the maximum number of elements to parse.
- `nbTopCandidates` (number, default `5`): the number of top candidates to consider when analysing how tight the competition is among candidates.
- `charThreshold` (number, default `500`): the number of characters an article must have in order to return a result.
- `classesToPreserve` (array): a set of classes to preserve on HTML elements when the `keepClasses` option is set to `false`.
- `keepClasses` (boolean, default `false`): whether to preserve all classes on HTML elements. When set to `false`, only classes specified in the `classesToPreserve` array are kept.
- `disableJSONLD` (boolean, default `false`): when extracting page metadata, cheer-reader gives precedence to Schema.org fields specified in the JSON-LD format. Set this option to `true` to skip JSON-LD parsing.
- `serializer` (function, default `$el => $el.html()`): controls how the `content` property returned by the `parse()` method is produced from the root DOM element. It may be useful to specify the `serializer` as the identity function (`$el => $el`) to obtain a cheerio element instead of a string for `content` if you plan to process it further.
- `allowedVideoRegex` (RegExp, default `undefined`): a regular expression that matches video URLs that should be allowed to be included in the article content. If `undefined`, the default regex is applied.
- `linkDensityModifier` (number, default `0`): a number that is added to the base link density threshold during the shadiness checks. This can be used to penalize nodes with a high link density or vice versa.
- `extraction` (boolean, default `true`): some callers are only interested in the metadata and don't want to pay the price of a full extraction. When this option is set to `false`, `content`, `textContent`, `length` and `excerpt` will be `null`.
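
The interaction between `keepClasses` and `classesToPreserve` can be sketched as a small pure function. This is only an illustration of the rule as described above, not cheer-reader's actual code: `filterClasses` is a hypothetical helper, and the library may apply additional rules internally.

```typescript
// Hypothetical helper, not part of cheer-reader: it only mirrors the
// documented relationship between `keepClasses` and `classesToPreserve`.
function filterClasses(
  classAttr: string,
  keepClasses: boolean = false,
  classesToPreserve: string[] = [],
): string {
  // With keepClasses enabled, every class survives untouched.
  if (keepClasses) return classAttr;

  // Otherwise keep only the class names found on the preserve list.
  const preserved = new Set(classesToPreserve);
  return classAttr
    .split(/\s+/)
    .filter((name) => name !== "" && preserved.has(name))
    .join(" ");
}

const original = "caption hero-image sidebar";

const stripped = filterClasses(original);                     // ""
const partial = filterClasses(original, false, ["caption"]);  // "caption"
const untouched = filterClasses(original, true);              // "caption hero-image sidebar"
```

With both options left at their defaults, every class in this sketch is removed; listing a class in `classesToPreserve` keeps just that class, and `keepClasses: true` makes the list irrelevant.
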
serializer: (el: Cheerio<Element> | null) => string | null
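
To make the difference between the default serializer and the identity serializer concrete, here is a minimal sketch. It assumes a stand-in `ElementLike` interface in place of cheerio's `Cheerio<Element>`, and `produceContent` is a hypothetical function showing how `content` could be derived from the root element; neither name comes from cheer-reader.

```typescript
// Minimal stand-in for a cheerio element (hypothetical; the real type is
// cheerio's Cheerio<Element>). Only the html() method matters here.
interface ElementLike {
  html(): string | null;
}

// The documented default: serialize the root element to an HTML string.
const defaultSerializer = (el: ElementLike): string | null => el.html();

// The identity serializer: hand the element back untouched.
const identitySerializer = (el: ElementLike): ElementLike => el;

// Hypothetical sketch: `content` is whatever the serializer returns.
function produceContent<T>(
  root: ElementLike,
  serializer: (el: ElementLike) => T,
): T {
  return serializer(root);
}

// A fake root element that always reports the same inner HTML.
const root: ElementLike = { html: () => "<p>Hello</p>" };

const asString = produceContent(root, defaultSerializer);   // "<p>Hello</p>"
const asElement = produceContent(root, identitySerializer); // the same object as `root`
```

In this sketch the type of `content` follows the serializer's return type, which is why the identity function yields an element that can be processed further rather than a string.
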