Differences

This shows you the differences between two versions of the page.

mission:log:2014:11:24:using-python-lxml-request-as-simple-scrape-robot-for-metrics-from-webpages [2015/01/09 09:42] – [The python solution] chrono
mission:log:2014:11:24:using-python-lxml-request-as-simple-scrape-robot-for-metrics-from-webpages [2015/07/08 08:07] – [In the beginning there was the copy] chrono
Line 22: Line 22:
 ===== In the beginning there was the copy =====
  
-Even if it appears unique and original to us, there always was some other inspiration/model to copy from. Most of what we do is based on other ideas and concepts laid out by other people before. And their ideas also evolved in the same matter. It's basically all about perception. I could present you the final python robot and say: "This is my awesome original work". And you might believe it, since it's slick, streamlined and very efficient. But that is just the current result. You wouldn't (and in most cases won't) see how crappy it began and how it evolved into its current form. But this is exactly what we're going to do today.
+Even if it appears unique and original to us, there always was some other inspiration/model to copy from. Most of what we do is based on other ideas and concepts laid out by other people before. And their ideas also evolved in the same manner. It's basically all about perception. I could present you the final python robot and say: "This is my awesome original work". And you might believe it, since it's slick, streamlined and very efficient. But that is just the current result. You wouldn't (and in most cases won't) see how crappy it began and how it evolved into its current form. But this is exactly what we're going to do today.
  
 ===== The Problem =====
Line 97: Line 97:
 Infrequently upstream data changed and introduced some incomprehensible white space changes as a consequence and sometimes just delivered 999.9 values. Pain to maintain. And since most relevant values came as floats there was no other solution than to use bc for floating point math & comparisons, since bash can't do it.
  
-And finally, the data structure and shipping method to influxdb is more than questionable, it would never scale. Each metric produces another new HTTP request creating a lot of wasteful overhead. But at the point of writing I simply didn't know enough to make it better.
+And finally, the data structure and shipping method to influxdb is more than questionable, it would never scale. Each metric produces another new HTTP request creating a lot of wasteful overhead. But at the point of writing I simply didn't knew enough to make it better.
  
 ===== The python solution =====