package to turn html into plaintext?
i want to try turning html pages into plaintext in the terminal. any ideas?
i could just strip all the tags but i want to see if there's an easier way to do what i want
also positioning would be cool...
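For reference, the "just strip all the tags" idea could look roughly like the sketch below. It is a naive regex approach, not a real HTML parser: `stripTags` is a made-up helper name, it does not decode entities like `&amp;`, and it throws away all layout and positioning.

```typescript
// Naive sketch: drop <script>/<style> blocks, then remove every remaining tag.
// `stripTags` is a hypothetical helper, not part of Deno or any library.
function stripTags(html: string): string {
  return html
    // remove script and style elements together with their contents
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, "")
    // replace each remaining tag with a space so adjacent words don't fuse
    .replace(/<[^>]+>/g, " ")
    // collapse runs of whitespace into single spaces
    .replace(/\s+/g, " ")
    .trim();
}

console.log(stripTags("<h1>Title</h1><p>Hello <b>world</b></p>"));
```

This is fine for a quick look at a page's text, but anything with nesting, entities, or malformed markup probably wants a proper parser instead.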
So you just want the text in all the elements?
Start at document.body. Take its children and iterate over them. If it’s a string then append it to whatever, if it’s a child then take their children and repeat the above process.
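The walk described above can be sketched like this. Since `document` is not available outside a browser, the example uses a hypothetical minimal node type (`HtmlNode`) standing in for real DOM nodes; with an actual DOM parser you would check the node type and iterate its child nodes instead.

```typescript
// Hypothetical minimal node shapes, standing in for real DOM nodes.
type TextNode = { kind: "text"; value: string };
type ElementNode = { kind: "element"; tag: string; children: HtmlNode[] };
type HtmlNode = TextNode | ElementNode;

// Depth-first walk: text nodes are appended, elements are recursed into.
function collectText(node: HtmlNode, out: string[] = []): string[] {
  if (node.kind === "text") {
    const trimmed = node.value.trim();
    if (trimmed.length > 0) out.push(trimmed); // skip whitespace-only text
  } else {
    for (const child of node.children) collectText(child, out);
  }
  return out;
}

// Stand-in for <body><h1>Title</h1><p>Hello <b>world</b></p></body>
const body: HtmlNode = {
  kind: "element",
  tag: "body",
  children: [
    { kind: "element", tag: "h1", children: [{ kind: "text", value: "Title" }] },
    {
      kind: "element",
      tag: "p",
      children: [
        { kind: "text", value: "Hello " },
        { kind: "element", tag: "b", children: [{ kind: "text", value: "world" }] },
      ],
    },
  ],
};

console.log(collectText(body).join("\n"));
```

Because the walk sees each element's tag before descending, it is also a natural place to add per-tag handling later, for example starting a new line for block elements.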
If you just want the html as a string then document.documentElement.outerHTML
i'm doing this in the terminal so i just used fetch() and used Response.body to get the html
however, i want to get the positioning and stuff so i can replicate the page on the terminal
What do you mean you’re doing it in the terminal? Like you’re viewing the raw html in the terminal or executing Deno from the terminal? Because fetch doesn’t exist in the terminal.
i am building a TUI, and i'm taking the fetch() results and, at the moment, just displaying the raw html.
however, i'd like to actually "render" the html, like a true TUI
TUI?
terminal user interface
Oh command line interface. I don’t know of a way to display graphical stuff in a terminal. I don’t think that’s possible.
Text-based user interface
In computing, text-based user interfaces (TUI) (alternately terminal user interfaces, to reflect a dependence upon the properties of computer terminals and not just text), is a retronym describing a type of user interface (UI) common as an early form of human–computer interaction, before the advent of graphical user interfaces (GUIs). Like GUIs,...
Okay? So how are you imagining the html to look in the terminal? Since everything in the terminal is made up of characters.
like this