Copying the contents of w3m-output (terminal output) from beginning to end

Rosika · January 26, 2020, 1:23pm

Hi all,

I use the w3m-browser on a regular basis and am really satisfied with it.
(see the w3m download page on SourceForge.net and http://w3m.sourceforge.net/)
It has quite a lot of advantages: distraction-free reading, no unnecessary loading of pics, gifs and ads (thus also much safer), bandwidth saving…

There´s just one thing I cannot work out.

To have text read out to me by a TTS (text-to-speech) script I wrote (whenever I´m too lazy to read…), I need to copy the output of the whole page to my clipboard.

By typing “m” the mouse-pointer turns into “selection” and I can copy the marked text.

But:

Generally the output of a page is larger than a single “page” of the terminal. So I´m forced to scroll down.
By scrolling however the mark gets lost.

So basically it´s impossible to mark the first word of the page, then scroll down a few times and then mark the last word with SHIFT + left-click of the mouse. This would work in leafpad, gedit, browsers however.

So I have to copy the content of a single page, copy it to my TTS-script, then scroll down one page, repeat the procedure and so on.

Does anybody know a solution to this problem?

Thanks a lot in advance.

Greetings.
Rosika

4dandl4 · January 26, 2020, 11:22pm

Hi Rosika, it has been a while. Daniel from DCT.

01101111 · January 27, 2020, 4:16am

these sound like they might not work in that particular browser, but i thought i would check just in case. have you tried ctrl + a? in firefox it grabs way too much (even sidebar text), but it sounds like w3m might not have the extra text or fields. the other one that also grabs too much, but might not in your situation is that i can highlight the first word of text on a wikipedia page with left click and mousing over the word (like what you describe with m), scroll to the bottom and then ctrl + shift + left mouse click highlights everything in between.

Rosika · January 27, 2020, 1:14pm

@4dandl4:
Hi Daniel. Nice to hear from you again. I hope you´re well and everything´s O.K.

Many greetings.
Rosika

Rosika · January 27, 2020, 1:21pm

@01101111:
Hi and thanks for your answer.

I didn´t even know about “ctrl + a” I have to admit. Shame on me.
I tried it and - as you already suggested - it didn´t work in w3m. It had no effect whatsoever, i.e. it didn´t mark anything. On chromium however this command works as you described.

I already tried your second proposal. The thing is: that one works - but only as long as I remain on the currently displayed page. When marking the first word and scrolling down to the end of the 2nd, 3rd or nth page the first mark gets lost.

As before this works in chromium, firefox, gedit etc.

It seems there´s not much that can be done to solve the problem when using w3m.

Thanks anyway for your help.

Greetings.
Rosika

Akito · January 27, 2020, 2:31pm

Well, turns out you were 5 minutes ahead of me. I was about to suggest asking the people directly involved with the project.

Though, scrolling now through the topics, it seems like this is one of the ghostiest ghost towns… You don’t even have to flip the page to see threads from 2003.

Maybe you will have to wait 17 years and counting for an answer.

Sadly the last update was on 2013-04-26.

For the worst case scenario, I tried to find an alternative to w3m but it seems like pretty much every respected CLI web browser and its derivatives is dying/dead… In today’s web world this kind of tool has just become more and more unfitting for its purpose. If you use w3m just because of TTS, you might find alternative software that does the job better regarding TTS on web pages. If you use w3m generally, maybe there is a scripting way of handling things, like this guy does in his example with another browser:
https://www.reddit.com/r/commandline/comments/4th6b5/a_commandline_web_browser_that_doesnt_suck/d5i0at6?utm_source=share&utm_medium=web2x

Theoretically, if scripting is sufficiently supported, you could create a script that enables “TTS mode” on a web page by pressing a specific key combination. In this TTS mode, you could e.g. reformat the website to display a lot more lines in a single terminal window, which would lead to fewer interruptions during TTS, like the one you originally described.

Rosika · January 27, 2020, 2:50pm

@Akito:
Hi and thank you.

Yes, I was going to tell you all that I tried the direct way for a possible solution but you noticed already.
Well, it doesn´t look promising as you pointed out.
Yet by now there have been 16 views. So at least the topic seems to be watched.

If I get an answer there I´ll let you know.

Many greetings.
Rosika

P.S.:
I just saw your post is longer than I originally noticed.
Thanks for the link. I´ll look at it.

Akito · January 27, 2020, 2:56pm

Could you briefly explain whether it would be okay for you to use TTS outside of the browser in the worst case scenario?
What if there was a dedicated tool that does it better than w3m ever could?

Because if an external program would be acceptable, I am sure there is a way to let the browser open a page in an external program which then automatically starts reading. So you keep your business with w3m as usual, but you leave some of the background work to another, better program that is made for TTS’ing web pages.
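
A rough sketch of what that could look like, under a few assumptions: w3m's -dump option prints the whole rendered page as plain text, and that text is piped straight into a TTS program. The names read_page_aloud and TTS_CMD are made up for illustration, and espeak-ng is only a stand-in default; any TTS script that reads from standard input could be used instead.

```shell
#!/bin/sh
# Sketch only: hand the full text of a page to a TTS program.
# TTS_CMD is a hypothetical setting; the espeak-ng default is an assumption,
# replace it with your own TTS script if that script reads from stdin.
TTS_CMD="${TTS_CMD:-espeak-ng --stdin}"

# read_page_aloud URL  (hypothetical helper name)
read_page_aloud() {
    url="$1"
    # -dump writes the whole rendered page to stdout instead of opening
    # the interactive view, so the text is not limited to one screen.
    # $TTS_CMD is left unquoted on purpose so its arguments are split.
    w3m -dump "$url" | $TTS_CMD
}
```

After sourcing the file, a call such as `read_page_aloud "https://distrowatch.com/"` would read the page without any marking or scrolling in the terminal.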

Rosika · January 27, 2020, 3:03pm

@Akito:

Yes I guess that would be O.K. But I think that´s what I´m currently doing as well. I mean my TTS-script is a “standalone”-programme. I just feed it the contents of my clipboard.

Interesting. That might be an option as well.
Thanks a lot.
Rosika

01101111 · January 27, 2020, 9:16pm

well, you taught me ll on linuxquestions so it all evens out i think

Rosika · January 28, 2020, 2:57pm

@01101111:
That´s really nice of you. Thanks.

Rosika · January 30, 2020, 1:13pm

Hi all,

in the meantime I came up with some sort of workaround (not a solution):
As w3m provides its output in the terminal it´s easy to redirect it to a text-file, like so:

w3m "https://distrowatch.com/" | tee output.txt

Yet this seems to be just a half-hearted solution, as it is only viable if no links need to be followed: the terminal output is just plain text.
Of course “tee” can be replaced by redirection “>” if double output should be avoided.

Basically what it comes down to is the following:
If I know beforehand that I want to use TTS I can use this command as it simplifies the process of “marking all” and thus copying everything in a single step.
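
A minimal sketch of that single step, with some assumptions: it uses w3m's -dump option to write the page text to a file and then feeds the file to a clipboard tool. The names page_to_clipboard and CLIP_CMD are hypothetical, and xclip is assumed to be installed (X11 only); another clipboard tool would need a different CLIP_CMD.

```shell
#!/bin/sh
# Sketch only: save the full page text to a file and copy it to the clipboard.
# CLIP_CMD is a hypothetical setting; xclip on X11 is an assumption.
CLIP_CMD="${CLIP_CMD:-xclip -selection clipboard}"

# page_to_clipboard URL [FILE]  (hypothetical helper name)
page_to_clipboard() {
    url="$1"
    outfile="${2:-output.txt}"
    # Dump the whole rendered page as plain text, then pass the saved
    # file to the clipboard tool only if the dump succeeded.
    w3m -dump "$url" > "$outfile" && $CLIP_CMD < "$outfile"
}
```

Calling `page_to_clipboard "https://distrowatch.com/"` would leave the text in output.txt and in the clipboard, ready to be fed to the TTS script.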

Greetings.
Rosika