rzh9b 2y ago

package to turn html into plaintext?

i want to try turning html pages into plaintext in the terminal. any ideas?
12 Replies
rzh9b 2y ago
i could just strip all the tags, but i want to see if there's an easier way to do what i want. also, positioning would be cool...
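For reference, the "just strip all the tags" approach can be sketched roughly as below. This is a naive regex-based sketch, not a real HTML parser: `stripTags` is a hypothetical helper name, it only decodes a handful of common entities, and it will misbehave on malformed markup or a `>` inside an attribute value.

```typescript
// Naive HTML-to-text sketch (assumption: input is reasonably well-formed HTML).
// `stripTags` is a hypothetical helper, not part of any library.
function stripTags(html: string): string {
  // Only a few common entities are handled; everything else is left as-is.
  const entities: Record<string, string> = {
    "&amp;": "&",
    "&lt;": "<",
    "&gt;": ">",
    "&quot;": '"',
    "&#39;": "'",
    "&nbsp;": " ",
  };
  return html
    // Drop script and style blocks together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Replace every remaining tag with a space so words do not run together.
    .replace(/<[^>]+>/g, " ")
    // Decode the entities listed above in a single pass.
    .replace(/&(amp|lt|gt|quot|#39|nbsp);/g, (m) => entities[m] ?? m)
    // Collapse runs of whitespace into single spaces.
    .replace(/\s+/g, " ")
    .trim();
}
```

This throws away all structure, so it gives plain text only and no positioning.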
Doctor 🤖 2y ago
So you just want the text in all the elements? Start at document.body, take its child nodes, and iterate over them. If a node is a text node, append its text to your output; if it’s an element, take its child nodes and repeat the process. If you just want the HTML as a string, use document.documentElement.outerHTML.
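The recursive walk described above might look something like this. It is a sketch over a hypothetical minimal tree type (`HtmlNode`, `collectText` are made-up names), since `document.body` only exists in a browser; in a terminal program the tree would have to come from whatever HTML parser is used.

```typescript
// Hypothetical minimal node types standing in for a parsed HTML tree.
type TextNode = { kind: "text"; value: string };
type ElementNode = { kind: "element"; tag: string; children: HtmlNode[] };
type HtmlNode = TextNode | ElementNode;

// Depth-first walk: text nodes are appended to `out`,
// element nodes are descended into, child by child.
function collectText(node: HtmlNode, out: string[] = []): string[] {
  if (node.kind === "text") {
    out.push(node.value);
  } else {
    for (const child of node.children) {
      collectText(child, out);
    }
  }
  return out;
}

// Usage: a tiny tree equivalent to
// <body><h1>Title</h1><p>Hello <b>world</b></p></body>
const body: HtmlNode = {
  kind: "element",
  tag: "body",
  children: [
    { kind: "element", tag: "h1", children: [{ kind: "text", value: "Title" }] },
    {
      kind: "element",
      tag: "p",
      children: [
        { kind: "text", value: "Hello " },
        { kind: "element", tag: "b", children: [{ kind: "text", value: "world" }] },
      ],
    },
  ],
};
const pieces = collectText(body);
```

Returning the pieces as an array rather than one joined string leaves it up to the caller how to separate them (spaces, newlines per block, and so on).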
rzh9b 2y ago
i'm doing this in the terminal, so i just used fetch() and read Response.body to get the html. however, i want to get the positioning and stuff so i can replicate the page on the terminal
Doctor 🤖 2y ago
What do you mean you’re doing it in the terminal? Like you’re viewing the raw html in the terminal or executing Deno from the terminal? Because fetch doesn’t exist in the terminal.
rzh9b 2y ago
i am building a TUI, and i'm taking the fetch() results and, at the moment, just displaying the raw html. however, i'd like to actually "render" the html, like a true TUI
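A very rough idea of what "rendering" could mean here is turning block-level content into lines that fit the terminal width. The sketch below assumes the HTML has already been reduced to a flat list of blocks; `Block`, `wrap`, and `renderBlocks` are hypothetical names, only `h1` and `p` are handled, and real positioning (CSS, tables, floats) is out of scope.

```typescript
// Hypothetical, already-extracted block content (not a real DOM type).
type Block = { tag: "h1" | "p"; text: string };

// Greedy word wrap: fills each line up to `width` characters.
// A single word longer than `width` is left on its own line and overflows.
function wrap(text: string, width: number): string[] {
  const lines: string[] = [];
  let line = "";
  for (const word of text.split(/\s+/).filter(Boolean)) {
    if (line && line.length + 1 + word.length > width) {
      lines.push(line);
      line = word;
    } else {
      line = line ? `${line} ${word}` : word;
    }
  }
  if (line) lines.push(line);
  return lines;
}

// Headings are upper-cased and underlined with "="; paragraphs are wrapped.
// Every block is followed by one blank line.
function renderBlocks(blocks: Block[], width: number): string[] {
  const out: string[] = [];
  for (const block of blocks) {
    const lines = wrap(block.text, width);
    if (block.tag === "h1") {
      const upper = lines.map((l) => l.toUpperCase());
      const longest = Math.max(0, ...upper.map((l) => l.length));
      out.push(...upper, "=".repeat(longest));
    } else {
      out.push(...lines);
    }
    out.push("");
  }
  return out;
}
```

The resulting lines can be printed one by one, or handed to whatever TUI layer draws the screen.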
Doctor 🤖 2y ago
TUI?
rzh9b 2y ago
terminal user interface
Doctor 🤖 2y ago
Oh command line interface. I don’t know of a way to display graphical stuff in a terminal. I don’t think that’s possible.
rzh9b 2y ago
Text-based user interface
In computing, text-based user interfaces (TUI) (alternately terminal user interfaces, to reflect a dependence upon the properties of computer terminals and not just text), is a retronym describing a type of user interface (UI) common as an early form of human–computer interaction, before the advent of graphical user interfaces (GUIs). Like GUIs,...
Doctor 🤖 2y ago
Okay? So how are you imagining the html to look in the terminal? Since everything in the terminal is made up of characters.
rzh9b 2y ago
like this