For anyone who cares, this is what our crashes look like.
TOP 20 url times are:

    39432223* Safari/601.3.9, search/names?&search=true&keyword=S%25%25%25
    39371188* Safari/601.3.9, search/names?&search=true&keyword=S%25%25%25
    39310161* Safari/601.3.9, search/names?&search=true&keyword=S%25%25%25
    39300736* Safari/601.3.9, search/names?&search=true&keyword=S%25
    39041255* Safari/537.36, taxa/LYCAENIDAE/complete
    38945840* Safari/537.36, taxa/LYCAENIDAE/complete
    38930099* Applebot, taxa/Gymnothorax_pictus/checklist
    6006648* Safari/537.36, taxa/LYCAENIDAE/complete
    5975658* Safari/537.36, taxa/LYCAENIDAE/complete
    5914978* Safari/537.36, taxa/LYCAENIDAE/complete
    5914961* Baiduspider/2.0, taxa/Talaurinus_prypnoides/checklist
    5863731* Baiduspider/2.0, taxa/Platyzosteria_jungi/checklist
    5838122* Baiduspider/2.0, taxa/Pterohelaeus_litigiosus/checklist
    5643696 Applebot/0.1, taxa/Eulecanium/checklist
    5582189 Applebot, taxa/0c1c84ef-4ad1-403f-a7f7-f45447a0372a
    5527142 Yahoo! Slurp, taxa/908440b7-da2f-4840-a81f-b5b3b5a2c14e
    5487277 Baiduspider, taxa/Ropalidia_plebeiana/statistics
    5430266 Yahoo! Slurp, taxa/Eumelea%20duponchelii
    5025511* Baiduspider/2.0, taxa/Pseudostrongyluris_polychrus/statistics
    4967732 Firefox/24.0, taxa/Siganus/checklist
    java.lang.OutOfMemoryError: GC overhead limit exceeded
The numbers on the left are request durations in milliseconds. An asterisk indicates that the request is still in progress. As you can see, the oldest requests have been running for more than 10 hours (39,432,223 ms is just under 11 hours) and still haven't returned. Obviously, I need some sort of watchdog to interrupt runaway threads.
The entries are sorted by duration, longest first, so the oldest requests are at the top. This tells the story. Someone, whose IP address I am not repeating here, searched for ‘S%%%’. Then they went “hmm, it’s not coming back”, hit the search button twice more, then got rid of the extra percentage signs and just searched for ‘S%’. And then, I suppose, they concluded that AFD “doesn’t work” and went away.
To fix this, there’s now a validation rule on the basic name search path through the app: if you use a wildcard, you must also have at least three non-wildcard characters. It’s a stronger version of the previous validation rule, which only rejected searches that consisted entirely of wildcards.
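As a rough illustration, the rule might look something like the sketch below. This is not the actual AFD code: the class name `NameSearchValidator`, the method `isAcceptable`, and the choice of `%` and `*` as the wildcard characters are all assumptions made for the example, as is rejecting empty keywords outright.

```java
// Hypothetical sketch of the wildcard rule; names and wildcard set are assumptions.
final class NameSearchValidator {
    // Assumption: '%' and '*' are the characters treated as wildcards.
    private static final String WILDCARDS = "%*";
    // A wildcard search must carry at least this many non-wildcard characters.
    private static final int MIN_LITERAL_CHARS = 3;

    private NameSearchValidator() {}

    /** Returns true if the keyword is allowed to reach the database. */
    static boolean isAcceptable(String keyword) {
        if (keyword == null || keyword.trim().isEmpty()) {
            return false; // assumption: an empty search is rejected
        }
        int wildcards = 0;
        int literals = 0;
        for (char c : keyword.trim().toCharArray()) {
            if (WILDCARDS.indexOf(c) >= 0) {
                wildcards++;
            } else if (!Character.isWhitespace(c)) {
                literals++;
            }
        }
        // Plain searches pass; wildcard searches need enough literal characters.
        return wildcards == 0 || literals >= MIN_LITERAL_CHARS;
    }
}
```

Under this sketch, both ‘S%%%’ and ‘S%’ would be rejected before they get anywhere near a query, while something like ‘Sig%’ or a plain ‘AVES’ would go through.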
The obvious question is, “why didn’t you think about this in the first place?” The answer is that we chose to make it permissive because we have legit users who really do want all the names in AVES and should be able to get them. Now I’m putting out fires, adding limits of various kinds on a case-by-case basis. It’s bitsy. Far from ideal.
What to do, what to do.
- Google “adding a watchdog timer to a webapp”. The difficulty is that web applications are not supposed to start their own threads. Tomcat will probably let me do it, but it shouldn’t. Need to find the proper way to go about this.
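For what it’s worth, here is a minimal sketch of what such a watchdog could look like using only `java.util.concurrent`. Everything in it is hypothetical (the class name `RequestWatchdog`, the `begin`/`end` methods, the limits), and it is one possible approach rather than the proper container-blessed answer. The idea is that each request thread registers itself on the way in and deregisters on the way out, and a single daemon thread periodically interrupts anything that has overstayed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: interrupts request threads that run longer than a limit.
final class RequestWatchdog implements AutoCloseable {
    private final Map<Thread, Long> started = new HashMap<>(); // thread -> start time (ns)
    private final long limitNanos;
    private final ScheduledExecutorService scanner;

    RequestWatchdog(long limitMillis, long scanIntervalMillis) {
        this.limitNanos = TimeUnit.MILLISECONDS.toNanos(limitMillis);
        // One daemon thread, so the watchdog never keeps the JVM alive by itself.
        this.scanner = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "request-watchdog");
            t.setDaemon(true);
            return t;
        });
        scanner.scheduleAtFixedRate(this::scan, scanIntervalMillis,
                scanIntervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Call at the start of a request, on the request thread. */
    synchronized void begin() {
        started.put(Thread.currentThread(), System.nanoTime());
    }

    /** Call in a finally block at the end of a request, on the request thread. */
    synchronized void end() {
        started.remove(Thread.currentThread());
        // Clear any pending interrupt so it cannot leak into the next
        // request served by this pooled thread.
        Thread.interrupted();
    }

    // Interrupts every registered thread that is over the limit. A thread that
    // ignores the interrupt is interrupted again on each later scan.
    private synchronized void scan() {
        long now = System.nanoTime();
        for (Map.Entry<Thread, Long> e : started.entrySet()) {
            if (now - e.getValue() > limitNanos) {
                e.getKey().interrupt();
            }
        }
    }

    @Override
    public void close() {
        scanner.shutdownNow();
    }
}
```

In a servlet container, one plausible arrangement would be to call `begin()` and `end()` from a servlet `Filter` wrapped around `chain.doFilter(...)`, and to create and close the watchdog from a `ServletContextListener` so that its thread starts and stops with the webapp. Note that `Thread.interrupt()` is cooperative: it only helps if the stuck code blocks in an interruptible call or checks the interrupt flag. A thread buried in a database call may not notice, in which case a query timeout on the JDBC `Statement` may be the more reliable limit.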