OK, I think the bookmarklet/text scraper needs a little tweaking. I just tried P...

LinaLauneBaer · on April 17, 2012

The good thing is that Instapaper's extraction process can be modified by anyone who has an account:

So if something is not working properly you can improve Instapaper on a site by site basis.

camiller · on April 17, 2012

I had no idea you could do that! That's cooler than the other side of the pillow!

vincentmac · on April 17, 2012

I just tried adding that Microsoft article via the Pocket Chrome extension (https://chrome.google.com/webstore/detail/niloccemoadcdkdjli...) and had no problems with it identifying the content correctly.

camiller · on April 17, 2012

I haven't tried the Chrome (or Firefox) extensions, just the JavaScript bookmarklet that Pocket provides here: http://getpocket.com/welcome?b=Bookmarklet

I'll have to give the extensions a try when I get a chance.

TazeTSchnitzel · on April 17, 2012

Some web devs (like me! :D) are lazy and make the <title> tag always just the website's name.

mikeklaas · on April 17, 2012

As someone who writes an HTML content extractor: you have no idea how much I hate you.

TazeTSchnitzel · on April 18, 2012

Oh I hate myself too. I can't distinguish the tabs.

Tyrannosaurs · on April 17, 2012

The first H1 tag might be better (and would also work in this case).

camiller · on April 17, 2012

The first H1 tag in the computerworlduk article says "Blogs", the second has the title.
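
To make the <title> vs. H1 trade-off above concrete, here is a rough sketch of one possible heuristic: collect every H1, pick the longest one, and only fall back to <title> when the page has no H1 at all. This is purely an illustration and not how Pocket or Instapaper actually extract titles; the names `TitleCandidates` and `guess_title` and the sample page are made up for the example, and "longest H1 wins" is an assumption that will misfire on plenty of real sites.

```python
from html.parser import HTMLParser


class TitleCandidates(HTMLParser):
    """Collect the text of <title> and of every <h1>, in document order."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1s = []
        self._current = None  # "title" or "h1" while inside one, else None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Join the collected pieces and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.title = text
            elif text:
                self.h1s.append(text)
            self._current = None


def guess_title(html):
    """Hypothetical heuristic: longest <h1>, else whatever <title> says."""
    parser = TitleCandidates()
    parser.feed(html)
    if parser.h1s:
        # A short section label like "Blogs" loses to a full headline.
        return max(parser.h1s, key=len)
    return parser.title


# Made-up page: site-name-only <title>, a section H1, then the real headline.
sample = (
    "<html><head><title>Example Site</title></head><body>"
    "<h1>Blogs</h1><h1>An Example Article Headline</h1>"
    "</body></html>"
)
print(guess_title(sample))  # prints "An Example Article Headline"
```

Taking the first H1 would return "Blogs" for this sample, and trusting <title> would return only the site name, which are the two failure cases described in the comments above.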