Apple Prototypes and Corporate Secrets Are for Sale Online–If You Know Where to Look

mandatory | 118 points | 11mon ago | www.wired.com

krackers|11mon ago

>chaining together a dozen dilapidated second-generation iPhone SEs and harnessing Apple's Live Text optical character-recognition feature to find possible inventory tags

This is the second time I've read about an iPhone OCR rack https://findthatmeme.com/blog/2023/01/08/image-stacks-and-ip...

Is this still state of the art in terms of local OCR?

mandatory|11mon ago

That's just because I did this talk and also made FindThatMeme :) So it's not a popular method, just what I happened to use to do large-scale OCR.

krackers|11mon ago

Oh I completely missed that you're actually the same guy!

talldayo|11mon ago

I think Tesseract is the smarter/faster/less obnoxious choice if you're not trying to parse weird meme text like the blog is doing. There's almost certainly a better paid option available in our enlightened AI age, but I don't even think you'd need AI for this use-case.
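
For clean text the plain-Tesseract route really is only a few lines. A rough sketch with pytesseract, assuming the tesseract binary and the pytesseract package are installed (the file name is made up):

```python
# Minimal Tesseract run via pytesseract. The image path is hypothetical and
# the tesseract binary plus the pytesseract package are assumed installed.
from PIL import Image
import pytesseract

img = Image.open("listing_photo.jpg")  # hypothetical listing photo
print(pytesseract.image_to_string(img))
```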

oefrha|11mon ago

The last time I used Tesseract (a year ago?), it was still pretty useless if your text isn't on a clean background. It doesn't even come close to Apple's proprietary on-device OCR.

ranger_danger|11mon ago

There is a whole page on their site dedicated to methods for improving the accuracy: https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html

I think most frontends to tesseract employ a lot of these methods and maybe more... but trying to use tesseract directly can indeed be difficult without extra processing of the image first.
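
For illustration, the kind of preprocessing that page recommends (grayscale, upscaling, thresholding) looks roughly like this with OpenCV before handing the result to Tesseract; the file name and parameter values are illustrative guesses, not tuned:

```python
# Sketch of typical pre-processing before Tesseract: grayscale, upscale,
# Otsu binarization. File name and parameters are illustrative guesses.
import cv2
import pytesseract

img = cv2.imread("listing_photo.jpg", cv2.IMREAD_GRAYSCALE)               # hypothetical file
img = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)    # upscale small text
_, img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # binarize
print(pytesseract.image_to_string(img, config="--psm 6"))                 # assume one text block
```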

oefrha|11mon ago

I know; I tried many things with the photo collection I was working with, including advice from that very page, generally with relatively poor results. (I ended up using Apple's framework on macOS.) The point is Tesseract is definitely not "smarter" in any way; at best it's on par with Apple's OCR when you hand it very clean text.

kergonath|11mon ago

The Apple framework is much, much better than Tesseract, and quicker as well. It is really good. Of course if you don’t need on-device processing, then there are cloud services that are better.
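
For anyone curious, the framework here is presumably Vision (VNRecognizeTextRequest). A minimal, untested sketch of driving it from Python via pyobjc (pip install pyobjc-framework-Vision), with a made-up file name:

```python
# Sketch of Apple's on-device OCR (Vision framework) driven from Python via
# pyobjc (pip install pyobjc-framework-Vision). Untested; file name is made up.
import Vision
from Foundation import NSURL

url = NSURL.fileURLWithPath_("listing_photo.jpg")
handler = Vision.VNImageRequestHandler.alloc().initWithURL_options_(url, {})

request = Vision.VNRecognizeTextRequest.alloc().init()
request.setRecognitionLevel_(Vision.VNRequestTextRecognitionLevelAccurate)

success, error = handler.performRequests_error_([request], None)
if success:
    for observation in request.results():
        # Best candidate string for each detected text region.
        print(observation.topCandidates_(1)[0].string())
```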

mike_d|11mon ago

> I think Tesseract is the smarter/faster/less obnoxious choice [...] There's almost certainly a better paid option available in our enlightened AI age

It would have cost $375,000 to use cloud OCR for this project. Mandatory is absolutely a baller, but not crazy enough to spend that kind of money on the project.

If you can get Tesseract to generate comparable results with sub-optimal images from eBay listings, I'd love to know more.

egorfine|11mon ago

I tried to use Tesseract to extract serial numbers from photos of iPhone boxes and had a 100% failure rate.

I then employed a multimodal LLM and had a 100% success rate.
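
For illustration, the multimodal-LLM route can look something like this with the OpenAI Python client; the model name, prompt, and file name are placeholders, not necessarily what was used here:

```python
# Sketch of OCR via a multimodal LLM using the OpenAI Python client.
# Model name, prompt, and file name are placeholders.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("iphone_box.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Read the serial number printed on this box. Reply with the serial number only."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```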

j45|11mon ago

I have seen some setups mix an LLM with OCR to improve both.

Considering what apps like Notes can quietly do on iOS, I wouldn't be surprised if more capability already exists.

IIRC, Apple was holding back improvements to Siri and other technologies.

epakai|11mon ago

Some of these developer devices get 'destroyed' and sold as scrap. dosdude1 has restored some of these kinds of devices to working order. There are pretty neat videos of the restorations:

ARM Apple Silicon Developer Transition Kit: https://www.youtube.com/watch?v=reQq8fx4D0Q

iPod Touch dev board: https://www.youtube.com/watch?v=qLCt6oHPTQM

The PCB repair technique for the DTK is pretty cool on its own.

userbinator|11mon ago

That first video appeared on HN with some discussion: https://news.ycombinator.com/item?id=40422359