Transformations
A transformation takes a list of nodes as input and returns a list of nodes. Each component that implements the Transformation class has a transform
definition responsible for transforming the nodes.
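The contract described above can be sketched without the library. The following is a minimal illustration, not the llamaindex implementation: SketchNode, SketchTransformation, TrimWhitespace, AddCharCount, and applyAll are hypothetical names invented for this example. It shows why the "list of nodes in, list of nodes out" shape matters: the output of one transformation can be passed straight to the next.

```typescript
// Hypothetical stand-ins for illustration only; these are NOT the llamaindex types.
interface SketchNode {
  text: string;
  metadata: Record<string, unknown>;
}

// A transformation takes a list of nodes and returns a list of nodes.
interface SketchTransformation {
  transform(nodes: SketchNode[]): Promise<SketchNode[]>;
}

// Example transformation: trims surrounding whitespace from each node's text.
class TrimWhitespace implements SketchTransformation {
  async transform(nodes: SketchNode[]): Promise<SketchNode[]> {
    return nodes.map((node) => ({ ...node, text: node.text.trim() }));
  }
}

// Example transformation: records the character count in each node's metadata.
class AddCharCount implements SketchTransformation {
  async transform(nodes: SketchNode[]): Promise<SketchNode[]> {
    return nodes.map((node) => ({
      ...node,
      metadata: { ...node.metadata, charCount: node.text.length },
    }));
  }
}

// Because input and output have the same shape, transformations chain:
// the output of one step becomes the input of the next.
async function applyAll(
  nodes: SketchNode[],
  transformations: SketchTransformation[],
): Promise<SketchNode[]> {
  let current = nodes;
  for (const transformation of transformations) {
    current = await transformation.transform(current);
  }
  return current;
}
```

Calling applyAll with [new TrimWhitespace(), new AddCharCount()] runs the two steps in order, so the character count is computed on the already trimmed text.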
Currently, the following components are Transformation objects:
Usage Pattern
While transformations are best used with an IngestionPipeline, they can also be used directly.
import {
  Document,
  MetadataMode,
  SentenceSplitter,
  TitleExtractor,
} from "llamaindex";

async function main() {
  // Split the document into nodes.
  let nodes = new SentenceSplitter().getNodesFromDocuments([
    new Document({ text: "I am 10 years old. John is 20 years old." }),
  ]);

  // Apply a transformation directly to the nodes.
  const titleExtractor = new TitleExtractor();
  nodes = await titleExtractor.transform(nodes);

  for (const node of nodes) {
    console.log(node.getContent(MetadataMode.NONE));
  }
}

main().catch(console.error);
Custom Transformations
You can implement any transformation yourself by implementing TransformComponent.
The following custom transformation removes any special characters or punctuation from the text of each node.
import { TransformComponent, TextNode } from "llamaindex";

export class RemoveSpecialCharacters extends TransformComponent {
  async transform(nodes: TextNode[]): Promise<TextNode[]> {
    for (const node of nodes) {
      // Drop every character that is neither a word character nor whitespace.
      node.text = node.text.replace(/[^\w\s]/gi, "");
    }
    // The nodes are modified in place and returned.
    return nodes;
  }
}
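To see what the regular expression in this transformation does, it can be tried on plain strings. This is a small standalone sketch; stripSpecialCharacters is a hypothetical helper written for this example and is not part of llamaindex.

```typescript
// Hypothetical helper: applies the same replacement as the transformation above
// to a single string.
function stripSpecialCharacters(text: string): string {
  // [^\w\s] matches anything that is not a letter (A-Z, a-z), digit,
  // underscore, or whitespace character.
  return text.replace(/[^\w\s]/gi, "");
}

// Punctuation is removed; letters, digits, and spaces are kept.
console.log(stripSpecialCharacters("I am 10 years old. John is 20 years old."));
// prints "I am 10 years old John is 20 years old"

// Underscores count as word characters, so they are kept.
console.log(stripSpecialCharacters("snake_case, kebab-case!"));
// prints "snake_case kebabcase"

// \w only covers ASCII, so accented letters are removed as well.
console.log(stripSpecialCharacters("café"));
// prints "caf"
```

The last case is worth keeping in mind for non-English text: a pattern built on \w strips accented and non-Latin letters along with the punctuation.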
These can then be used directly or in any IngestionPipeline.
import { Document, IngestionPipeline, MetadataMode } from "llamaindex";

// RemoveSpecialCharacters is the custom transformation defined above.
async function main() {
  const pipeline = new IngestionPipeline({
    transformations: [new RemoveSpecialCharacters()],
  });

  const nodes = await pipeline.run({
    documents: [
      new Document({ text: "I am 10 years old. John is 20 years old." }),
    ],
  });

  for (const node of nodes) {
    console.log(node.getContent(MetadataMode.NONE));
  }
}

main().catch(console.error);