Class MozillaReadabilityTransformer

A transformer that uses the Mozilla Readability library to extract the main content from a web page.

Example

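import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { MozillaReadabilityTransformer } from "@langchain/community/document_transformers/mozilla_readability";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

const loader = new CheerioWebBaseLoader("https://example.com/article");
const docs = await loader.load();

// chunkSize is the maximum size of each chunk, measured in characters by default.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 5000,
});
const transformer = new MozillaReadabilityTransformer();

// The sequence processes the loaded documents through the splitter and then the transformer.
const sequence = splitter.pipe(transformer);

// Invoke the sequence to transform the documents into a more readable format.
const newDocuments = await sequence.invoke(docs);

console.log(newDocuments);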

Hierarchy

MappingDocumentTransformer
- MozillaReadabilityTransformer

Index

Constructors

constructor

Properties

options

Constructors

constructor

new MozillaReadabilityTransformer(options?): MozillaReadabilityTransformer
Parameters
- options: Options = {}
Returns MozillaReadabilityTransformer
Overrides MappingDocumentTransformer.constructor
- Defined in libs/langchain-community/src/document_transformers/mozilla_readability.ts:36

Properties

`Protected` options

options: Options = {}
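
The constructor signature and the protected `options` property above can be illustrated with a small, dependency-free sketch. It is not the library's implementation: `SketchDocument`, `SketchOptions`, `stripTags`, and `ReadabilityLikeTransformer` are hypothetical names introduced here, and crude tag stripping stands in for real main-content extraction. The sketch only shows the general shape: options default to an empty object, are stored on the instance, and are consulted while each input document is mapped to exactly one output document.

```typescript
// Hypothetical stand-ins for illustration; these are not the library's types.
interface SketchDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

interface SketchOptions {
  // Assumed option: minimum number of characters the extracted text must have.
  charThreshold?: number;
}

// Crude tag stripping; a stand-in for real main-content extraction.
function stripTags(html: string): string {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

class ReadabilityLikeTransformer {
  // Stored on the instance so subclasses could read it.
  protected options: SketchOptions;

  // Options are optional and default to an empty object.
  constructor(options: SketchOptions = {}) {
    this.options = options;
  }

  // Maps each input document to exactly one output document.
  async transformDocuments(docs: SketchDocument[]): Promise<SketchDocument[]> {
    return docs.map((doc) => this.transformOne(doc));
  }

  private transformOne(doc: SketchDocument): SketchDocument {
    const text = stripTags(doc.pageContent);
    const threshold = this.options.charThreshold ?? 0;
    return {
      // Text shorter than the threshold is dropped; metadata is copied through.
      pageContent: text.length >= threshold ? text : "",
      metadata: { ...doc.metadata },
    };
  }
}
```

In this sketch, constructing the class with no arguments keeps all extracted text, while passing a `charThreshold` larger than the extracted text length yields an empty `pageContent`. Metadata is copied to the output document in both cases.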
