Show HN: VisaWhen – Data on US visa issuance backlogs
Heya! Not the usual sort of thing to be posted here, but I wanted to show off what I made yesterday. Here's a sample page about H-1B visas issued in Bogota: <https://visawhen.com/consulates/bogota/h1b>

The code is source-available (not open source) at <https://github.com/underyx/visawhen>. It's my first time choosing a source-available license over MIT, mainly out of fear of existing immigration startups just gobbling this data and code up. Frankly, I didn't think the implications through; I just threw a safe license on there.

The way the project works is:

- Use requests-html to find publicly available PDFs from government pages
- Use camelot to OCR the PDFs and extract data tables from them
- Since the previous step takes crazy long for my tastes (around 8000 pages at around 5 seconds each), I've used dask to split the work into chunks and parallel-process them across my laptop's CPUs
- Do data cle
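
To put rough numbers on the chunking step above, here is a minimal sketch in plain Python. It is not the project's code and makes no dask calls; it only shows how a page list might be split into near-equal chunks and what the quoted figures (around 8000 pages, around 5 seconds each) imply for run time. The helper names `split_into_chunks` and `estimated_hours` and the worker count of 8 are assumptions made up for illustration; the post does not say how many CPUs were used.

```python
from math import ceil

# Figures quoted in the post; both are approximate.
TOTAL_PAGES = 8000
SECONDS_PER_PAGE = 5
# Hypothetical worker count, not stated in the post.
N_WORKERS = 8


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal slices."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    if not items:
        return []
    size = ceil(len(items) / n_chunks)
    return [items[i:i + size] for i in range(0, len(items), size)]


def estimated_hours(n_pages, seconds_per_page, n_workers):
    """Idealised wall-clock hours, ignoring overhead and uneven page costs."""
    pages_per_worker = ceil(n_pages / n_workers)
    return pages_per_worker * seconds_per_page / 3600


pages = list(range(1, TOTAL_PAGES + 1))
chunks = split_into_chunks(pages, N_WORKERS)

serial = estimated_hours(TOTAL_PAGES, SECONDS_PER_PAGE, 1)
parallel = estimated_hours(TOTAL_PAGES, SECONDS_PER_PAGE, N_WORKERS)

print(f"{len(chunks)} chunks of {len(chunks[0])} pages")  # 8 chunks of 1000 pages
print(f"serial: about {serial:.1f} h")                    # about 11.1 h
print(f"{N_WORKERS} workers: about {parallel:.1f} h")     # about 1.4 h
```

Under these assumptions a serial run would take roughly 11 hours, which is presumably why the work was parallelised. Each chunk could then be handed to whatever scheduler is in use; real speedups will be lower than the idealised figure because of scheduling overhead and pages that take longer than average.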