dcalano
👨‍💻
Grindin' all my life

Organizations

@oduwsdl

Starred repositories

Various Jupyter notebooks about Common Crawl data

Jupyter Notebook 49 9 Updated Jun 2, 2022

An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.

Scala 146 19 Updated Sep 19, 2024

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Scala 139 33 Updated Feb 27, 2024

Various examples of notebooks for working with web archives with the Archives Unleashed Toolkit, and derivatives generated by the Archives Unleashed Toolkit.

Jupyter Notebook 24 4 Updated Dec 5, 2022

Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)

Python 154 28 Updated Aug 27, 2020

Streaming WARC/ARC library for fast web archive IO

Python 391 58 Updated Dec 10, 2024

Java library for reading and writing WARC files with a typed API

Java 49 10 Updated Dec 19, 2024

A list of things related to software, literature, and other content for 🕣 Memento

94 8 Updated May 29, 2024

An email and SMTP testing tool with API for developers

Go 6,373 158 Updated Jan 1, 2025

smtp4dev - the fake smtp email server for development and testing

C# 3,253 350 Updated Dec 30, 2024

A command-line benchmarking tool

Rust 23,485 378 Updated Jan 5, 2025

View .obj files in the terminal 🦀

Rust 234 5 Updated Nov 18, 2024

🌍 Discover our global repository of countries, states, and cities! 🏙️ Get comprehensive data in JSON, SQL, PSQL, XML, YAML, and CSV formats. Access ISO2, ISO3 codes, country code, capital, native l…

PHP 7,684 2,640 Updated Jan 4, 2025

Command-line tool and Rust library for handling Web ARChive (WARC) files

Rust 10 1 Updated Nov 14, 2024

ETL, Analytics, Versioning for Unstructured Data

Python 2,164 97 Updated Jan 5, 2025

Interactive roadmaps, guides and other educational content to help developers grow in their careers.

TypeScript 303,778 39,776 Updated Jan 5, 2025

Render source code in 3D, for macOS and iOS.

Swift 188 1 Updated Dec 1, 2024

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 5,418 425 Updated Jan 5, 2025

Scripts to simplify setting up a Windows developer box

PowerShell 1,773 394 Updated Feb 2, 2024

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)

Python 10,336 1,179 Updated Jun 25, 2024

Undetected Web-Scraping & Seamless HTML Parsing in Python!

Python 184 9 Updated Oct 24, 2024

Converts some webnovels to epub format

TypeScript 761 17 Updated Jan 1, 2025

🪄 Create rich visualizations with AI

TypeScript 1,432 87 Updated Jan 2, 2025
TypeScript 3,280 313 Updated Nov 15, 2024

RFHunter is a device to find hidden Cameras at AirBNBs

C++ 1,146 37 Updated Oct 31, 2024

HTTrack Website Copier, copy websites to your computer (Official repository)

C 3,673 665 Updated Aug 13, 2024

A terminal music player.

C 1,124 33 Updated Dec 29, 2024

Experience timeless melodies with a music player that blends classic design with modern technology.

Dart 788 22 Updated Jan 5, 2025

An open-source RAG-based tool for chatting with your documents.

Python 19,447 1,497 Updated Jan 5, 2025