Blog on PHP Web Development by Christian Olear (Otsch)

Crwlr Recipes: Using a Crawler for Website Error Detection and Cache Warming

crwlr.software

2025-01-20

Have you ever deployed your website or web app, only to discover hours later that you’ve introduced bugs or broken links? Or do you clear the cache with every deploy, leaving the first users to experience slow performance? In this guide, you’ll learn how to use a crawler to automatically detect errors and warm the cache, ensuring your site runs smoothly after every deployment.

crwlr.software

2024-06-05

Version 1.8 of the crwlr/crawler package is out, introducing key new functions that will replace existing ones in v2.0. Addressing previous issues with composing crawling result data, this update provides a solution that enhances performance, minimizes memory usage further, and simplifies the process, making it more intuitive and easier to understand.

A Quickstart Tutorial on PHP Generators

crwlr.software

2024-06-05

Since working with generators can be a bit tricky if you're new to them, this post offers an intro on how to use them and highlights common pitfalls to avoid.

otsch.codes

2023-11-24

Abstract classes cannot be instantiated directly, posing a challenge when testing functionality implemented within the abstract class itself. In this article, I will share my approach to addressing this issue.

crwlr.software

2023-11-16

This is the first article of our "Crwlr Recipes" series, providing a collection of thoroughly explained code examples for specific crawling and scraping use-cases. This first article describes how you can crawl any website fully (all pages) and extract the data of schema.org structured data objects from all its pages, with just a few lines of code.

otsch.codes

2023-03-01

My friend Florian Bauer recently posted an article saying that PHP needs a rebranding and that he would rename it to HypeScript. Here's my two cents on that subject.

crwlr.software

2023-02-08

I'm very proud to announce that version 1.0 of the crawler package is finally released. This article gives you an overview of why you should use this library for your web crawling and scraping jobs.

What's new in crwlr / crawler v0.6?

crwlr.software

2022-10-03

Version 0.6 is probably the biggest update so far, with a lot of new features and steps, from crawling whole websites and sitemaps to extracting metadata and schema.org structured data from HTML. Here is an overview of all the new stuff.

What's new in crwlr / crawler v0.5?

crwlr.software

2022-09-03

We're already at v0.5 of the crawler package and this version comes with a lot of new features and improvements. Here's a quick overview of what's new.

crwlr.software

2022-06-02

There is a new package in town called query-string. It lets you create, access, and manipulate query strings for HTTP requests in a very convenient way. Here's a quick overview of what you can do with it and how it can be used via the url package.

What's new in crwlr / crawler v0.4

crwlr.software

2022-05-10

Last Friday, version 0.4 of the crawler package was released with some pretty useful improvements. Read what's shipped with this new minor update.

crwlr.software

2022-04-30

There are already two new 0.x versions of the crawler package. Here's a quick summary of what's new in versions 0.2 and 0.3.

Release of crwlr / crawler v0.1.0

crwlr.software

2022-04-18

After months of hard work, today I'm finally releasing the first version (v0.1.0) of the crwlr / crawler package. Here's some information on what it is, its current state, and its current and future features.

otsch.codes

2022-02-01

If you're just starting out in web development, then one very fundamental thing to learn on your journey will be HTTP. I learnt it bit by bit over the course of years, probably like many other developers. Learning the basics at the very beginning will help you identify, understand, and solve many problems in your projects faster. In this post I'll start with an overview.

otsch.codes

2022-01-20

I've been unemployed for a few weeks now and am starting to build my own SaaS project. A very obvious change from my job last year is that I'm on my own now and responsible for more than just coding. Here are some thoughts on what I think you should focus on and how to organize and juggle it all.

crwlr.software

2022-01-19

Homograph attacks use internationalized domain names (IDNs) to build malicious links with domains that look like those of trusted organizations. You can use the crwlr Url class to detect and monitor URLs containing IDNs in your users' input.

otsch.codes

2021-12-20

Today I am celebrating that I have finally quit my job and decided to start my own business. I'll try to document my journey and my thoughts on that topic for anyone who is interested. I don't know whether it will succeed or fail, but if it fails, at least you will know one way not to do it. Let me start by telling the personal story that led me to this point.

Why I start crwlr.software

crwlr.software

2018-04-15

This is just a short introduction to what crwlr.software is and will become in the future and why you may like it.