
Hi,

Whenever I write a program that has to interface with the web (scraping, POSTing, whatever), I never know how to properly test it. What I have been doing to date is fetching some pages ahead of time, saving them locally, and running my parsers (or whatever it is I'm coding at the moment) against those saved copies. The problem with this approach is that it can't test a whole lot: if we have a crawler, how do we test that it goes to the next page properly? Testing things like logging in seems close to impossible; all we can really check is that we are sending a well-formed POST.

Let's stick to the crawler example. How would you test that it follows links? Do people set up local webservers with a few dummy pages for the program to download? Or do you just inspect that the GET and POST requests ‘look’ correct? (I've put rough sketches of my current setup and of the dummy-server idea in the P.S. below.)

Assume that we don't own the sites, so we can't let the program run its tests in the wild: page content might change (parser tests fail), APIs might change (unexpected stuff comes back), our account might get locked (I guess they didn't like us logging in 20 times in the last hour during tests), &c. Of course there is nothing that can prevent upstream changes, but I'm wondering how we can test the more-or-less static stuff without calling out into the world.

-- Mateusz K.
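
P.S. To make the question concrete, here is roughly what my current "fetch ahead of time and test against a saved page" setup looks like. parse_titles, myproject.parser and the fixture file name are just stand-ins for whatever I happen to be writing at the time:

    import unittest
    from pathlib import Path

    from myproject.parser import parse_titles  # hypothetical parser under test

    FIXTURES = Path(__file__).parent / "fixtures"

    class ParserTest(unittest.TestCase):
        def test_titles_extracted_from_saved_page(self):
            # The page was fetched once by hand and committed next to the tests.
            html = (FIXTURES / "listing_page_1.html").read_text(encoding="utf-8")
            titles = parse_titles(html)
            self.assertIn("Some article I know is on that page", titles)

    if __name__ == "__main__":
        unittest.main()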
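
And this is the kind of "local webserver with a few dummy pages" test I was imagining for the link-following question. The server side is just the standard library's http.server; crawl() and the set of visited URLs it returns are assumptions about my own crawler's interface, not any real library:

    import threading
    import unittest
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

    from myproject.crawler import crawl  # hypothetical: crawl(url) -> set of visited URLs

    # Two dummy pages: the root links to /page2, so a correct crawler should reach it.
    PAGES = {
        "/": b'<html><body><a href="/page2">next</a></body></html>',
        "/page2": b"<html><body>the end</body></html>",
    }

    class DummySiteHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = PAGES.get(self.path)
            if body is None:
                self.send_error(404)
                return
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep test output quiet

    class CrawlerFollowsLinksTest(unittest.TestCase):
        def test_crawler_reaches_linked_page(self):
            server = ThreadingHTTPServer(("127.0.0.1", 0), DummySiteHandler)
            thread = threading.Thread(target=server.serve_forever, daemon=True)
            thread.start()
            try:
                base = "http://127.0.0.1:%d" % server.server_address[1]
                visited = crawl(base + "/")
                self.assertIn(base + "/page2", visited)
            finally:
                server.shutdown()
                server.server_close()

    if __name__ == "__main__":
        unittest.main()

Binding to port 0 lets the OS pick a free port, so the test shouldn't clash with anything else running on the machine.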