Difference between revisions of "Stofradar"

From RevSpace
(Introduction)
(Introduction)
(27 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
   {{Project
 
   {{Project
 
   |Name=Stofradar
 
   |Name=Stofradar
   |Picture=Stofradar3.png
+
   |Picture=stofradar.png
   |Omschrijving=Visualizing atmospheric particulate matter concentrations on a map
+
   |Omschrijving=Visualizing airborne particulate matter concentrations on a map
 
   |Status=Completed
 
   |Status=Completed
 
   |Contact=bertrik
 
   |Contact=bertrik
Line 18: Line 18:
 
The website [https://sensor.community sensor.community] is an initiative to allow citizens to participate in measuring atmospheric particulate matter concentration using an inexpensive and [https://sensor.community/nl/sensor-bouwen/ easy to build sensor].
 
The website [https://sensor.community sensor.community] is an initiative to allow citizens to participate in measuring atmospheric particulate matter concentration using an inexpensive and [https://sensor.community/nl/sensor-bouwen/ easy to build sensor].
 
They collect this data, calculate 5 minute and daily averages and publish it again as open data.
 
They collect this data, calculate 5 minute and daily averages and publish it again as open data.
The total number of sensors is > 6000 worldwide, most of them in Germany, Bulgaria, Belgium, Austria, Sweden.
+
The total number of sensors is > 12000 worldwide, most of them in Germany, Bulgaria, Belgium, Austria, Sweden.
The Netherlands has about 100 sensors.
+
The Netherlands has > 2000 sensors. See also [https://stats.sensor.community/].
  
 
Future activities:
 
Future activities:
* plot PM2.5 and apply humidity correction
+
* play with different algorithm:
* plot official [https://www.luchtmeetnet.nl/ luchtmeetnet] measurements on the map too, access to the API has been implemented:[https://github.com/bertrik/luftdatenmapper/tree/master/workspace/luftdatenmapper/src/main/java/nl/bertriksikken/luchtmeetnet/api work in progress]
+
** for each pixel consider only stations with a certain radius (say 10 km), calculate the median, convert median to color
 
+
** should more naturally filter out outliers and is actually a bit similar to what sensor.community uses.
* review and update stats mentioned above (e.g. "100 sensors in netherlands")
+
* plot official [https://www.luchtmeetnet.nl/ luchtmeetnet] measurements on the map too, access to the API has been implemented [https://github.com/bertrik/luftdatenmapper/tree/master/luftdatenmapper/src/main/java/nl/bertriksikken/luchtmeetnet/api work in progress]
* verify that gzip compression is used on the sensor.community JSON download
+
* <s>base humidity on BME280 sensors only, ignore DHT11/22 type sensors</s>
 +
* add a water mark
  
 
== Visualisation ==
 
== Visualisation ==
Line 32: Line 33:
  
 
=== Background map ===
 
=== Background map ===
Pages to investigate:
+
The map background on stofradar.nl is based on https://mapsvg.com/maps/netherlands
* https://wiki.openstreetmap.org/wiki/OSM_on_Paper
 
* http://maps.stamen.com/m2i/#toner-background/600:800/6/52.200/5.300
 
* https://developer.mapquest.com/documentation/open/static-map-api/v5/examples/basic/map-bounding-box/
 
  
What I need/want is to be able to specify the 'bounding box' and use a '''equirectangular projection''' ,
+
The map projection used is the '''equirectangular projection''' (EPSG-32662),
 
so I can easily map a pixel back to a latitude/longitude.
 
so I can easily map a pixel back to a latitude/longitude.
  
The equirectangular projection is contained in standard EPSG-32662
+
=== Data filtering ===
 +
There is only very minimal data filtering. Sensor measurements are taken into account as follows:
 +
* Sensors from an area 2x2 times bigger than the area visualized are considered for visualisation
 +
* Sensors marked as 'indoor' are ignored
 +
* Sensors with a measurement value smaller than 0 are ignored
 +
* The top percent of highest PM2.5 concentrations is discarded, this mostly takes care of outliers caused by defective sensors
 +
* When sensor data is not available in the past 5 minutes, data from a previous measurement interval is used, up to 1 hour old
 +
* A (small) number of sensors that are known to always report a very high value are not considered (blacklisted)
  
 
=== Interpolation ===
 
=== Interpolation ===
Line 49: Line 54:
 
To calculate the distance, I use a very simple approximation:
 
To calculate the distance, I use a very simple approximation:
 
* calculate the "middle" of the map (average latitude/longitude between top-left and bottom-right);
 
* calculate the "middle" of the map (average latitude/longitude between top-left and bottom-right);
* calculate the "km-per-degree-latitude" at the middle for latitude as 40000 km / 360 degrees;
+
* calculate the "km-per-degree-latitude" at the middle for latitude as 40075 km / 360 degrees;
 
* calculate the "km-per-degree-longitude" at the middle for longitude as the number above multiplied with cos(latitude);
 
* calculate the "km-per-degree-longitude" at the middle for longitude as the number above multiplied with cos(latitude);
 
* determine the difference in longitude and the difference in latitude;
 
* determine the difference in longitude and the difference in latitude;
 
* convert both to km using the factors calculated earlier;
 
* convert both to km using the factors calculated earlier;
 
* calculate the [https://en.wikipedia.org/wiki/Euclidean_distance euclidean distance].
 
* calculate the [https://en.wikipedia.org/wiki/Euclidean_distance euclidean distance].
A better way would be to use the
 
[https://en.wikipedia.org/wiki/Great-circle_distance 'great-circle-distance'] and possibly even account for the fact that the earth is not perfectly spherical, but I like to start simple and this makes the calculation a lot faster.
 
  
Pixels that are not within a certain distance of any sensor station (e.g. 25 km) are rendered as grayscale, to indicate a geographic limit of each sensor.
+
Pixels that are not within a certain distance of any sensor station (e.g. 10 km) are rendered as grayscale, to indicate a geographic limit of each sensor.
  
Only sensors within a reasonable range of the map are taken into account, currently this is an area of 9 times (3x3) the visible area.
+
Only sensors within a reasonable range of the map are taken into account, currently this is an area of 4 times (2x2) the visible area.
  
PM10 values higher than 500 ug/m3 are simply ignored in the software.
+
PM10 values < 0 ug/m3 are ignored in the software.
  
 
=== Colour range ===
 
=== Colour range ===
The colours I'm using (a kind of inverted spectral range from blue to red):
+
[[File:luchtmeetnet_lki.png|right|thumb|Luchtmeetnet ranges]]
*  0 ug/m3: fully transparent white (#FFFFFF)
+
 
*  25 ug/m3: semi-transparent cyan (#00FFFF)
+
The colours I'm using are based on the scale used for air quality index from luchtmeetnet with data from RIVM,
*  50 ug/m3: semi-transparent yellow (#FFFF00). This is the threshold for the maximum daily average for PM10 in the Netherlands.
+
see https://www.luchtmeetnet.nl/informatie/luchtkwaliteit/luchtkwaliteitsindex-(lki)
* 100 ug/m3: semi-transparent red (#FF0000)
+
 
* 200 ug/m3 and higher: semi-transparent purple (#FF00FF)
+
The input value is the PM2.5 concentration.
 +
 
 
Values in between these levels are interpolated linearly with respect to the RGB colour value and alpha channel.
 
Values in between these levels are interpolated linearly with respect to the RGB colour value and alpha channel.
 
This scale is approximately logarithmic, with each step being twice as big as the previous one.
 
  
 
=== Correction for high humidity ===
 
=== Correction for high humidity ===
Line 87: Line 89:
 
* it combines formulas and coefficients from different sources where relative humidity has different units. One paper seems to use an RH-value from 0 to 100, while another uses a kind of normalized relative humidity (from 0 to 1). You cannot just use the same coefficients if the unit is different.
 
* it combines formulas and coefficients from different sources where relative humidity has different units. One paper seems to use an RH-value from 0 to 100, while another uses a kind of normalized relative humidity (from 0 to 1). You cannot just use the same coefficients if the unit is different.
 
* it claims a humidity correction for PM10 with coefficients that is not found in the source paper.
 
* it claims a humidity correction for PM10 with coefficients that is not found in the source paper.
 
=== Compositing ===
 
I use imagemagick for this, for example:
 
  composite -compose over -geometry 600x800 20180605_210100.json.png netherlands.png output.png
 
where netherlands.png is an 600x800 opaque black-and-white image of the map of the netherlands
 
and 20180605_210100.json.png is an 60x80 image of dust concentrations with an alpha channel
 
  
 
=== Animation ===
 
=== Animation ===
The idea of animation is to combine several 5-minute images into a movie.
+
Besides an image with current data from the last 5 minutes, every hour two animations are created:
 
+
* GIF animation composed of hourly images over the past 24 hours
There are several options and file formats to choose from:
+
* WEBM animation composed of 5-minute images over the past 24 hours
* GIF appears way too big, resulting in files of about 40 MB for 288 frames.
 
* APNG is only slighty smaller at about 37 MB.
 
* the webm format seems more suited, can be less than 1MB (lossy, VP9).
 
 
 
Example conversion command for webm:
 
  cat ~/luftdatenmapper/tmp/netherlands/*.png |ffmpeg -f image2pipe -r 12 -i - -b:v 200k netherlands.webm
 
  
This results in an output file of about 650 kB.
+
You can click on the GIF animation to view the WEBM animation.
  
 
== Software ==
 
== Software ==
 
See the [https://github.com/bertrik/luftdatenmapper github page] for the source code.
 
See the [https://github.com/bertrik/luftdatenmapper github page] for the source code.

Revision as of 08:08, 15 October 2021

Project Stofradar
Stofradar.png
Visualizing airborne particulate matter concentrations on a map
Status Completed
Contact bertrik
Last Update 2021-10-15

Introduction

This page is about creating a 'stofradar' image of atmospheric particulate matter concentrations based on the raw data measured by the sensor.community network, see www.stofradar.nl.

The focus is on raw visualisation of the source data; only a minimal attempt is made to "validate" the data. Sensor measurements and sensor locations are basically uncontrolled, so we cannot tell whether a particular sensor is defective or has an unusual position that affects its measurements.

See also my DustSensor page.

The website sensor.community is an initiative that allows citizens to participate in measuring atmospheric particulate matter concentration using an inexpensive and easy-to-build sensor. They collect this data, calculate 5-minute and daily averages, and publish the results as open data. There are more than 12000 sensors worldwide, most of them in Germany, Bulgaria, Belgium, Austria and Sweden. The Netherlands has more than 2000 sensors. See also https://stats.sensor.community/.

Future activities:

  • experiment with a different algorithm:
    • for each pixel, consider only stations within a certain radius (say 10 km), calculate the median, and convert the median to a colour
    • this should filter out outliers more naturally and is actually a bit similar to what sensor.community uses.
  • plot official luchtmeetnet measurements on the map too; access to the API has been implemented (work in progress)
  • base humidity on BME280 sensors only, ignore DHT11/22 type sensors
  • add a watermark

Visualisation

The general idea is to create an image with a map in the background and the atmospheric particulate matter concentration overlaid on top.

Background map

The map background on stofradar.nl is based on https://mapsvg.com/maps/netherlands

The map projection used is the equirectangular projection (EPSG-32662), so I can easily map a pixel back to a latitude/longitude.
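As an illustration of why the equirectangular projection makes this easy, here is a minimal sketch of the pixel-to-coordinate mapping. The function name, the parameter names and the choice to sample at the pixel centre are assumptions for illustration and are not taken from the stofradar source code.

```python
def pixel_to_latlon(x, y, width, height, north, west, south, east):
    """Map the centre of pixel (x, y) to (latitude, longitude) in degrees.

    In an equirectangular projection, longitude is linear in x and
    latitude is linear in y, so plain linear interpolation over the
    bounding box is enough. Pixel (0, 0) is the top-left corner.
    """
    lon = west + (x + 0.5) / width * (east - west)
    lat = north - (y + 0.5) / height * (north - south)
    return lat, lon
```

For example, with a hypothetical bounding box of 54°N to 50°N and 3°E to 7°E on a 2x2 pixel image, pixel (0, 0) maps to latitude 53.0, longitude 4.0.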

Data filtering

There is only very minimal data filtering. Sensor measurements are taken into account as follows:

  • Sensors from an area 4 times (2x2) as large as the visualized area are considered for visualisation
  • Sensors marked as 'indoor' are ignored
  • Sensors with a measurement value smaller than 0 are ignored
  • The top 1 percent of PM2.5 concentrations is discarded; this mostly takes care of outliers caused by defective sensors
  • When sensor data is not available in the past 5 minutes, data from a previous measurement interval is used, up to 1 hour old
  • A (small) number of sensors that are known to always report a very high value are not considered (blacklisted)
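The filtering rules above could be sketched roughly as follows. This is only an illustration: the `Reading` record, its field names and the way the top percent is rounded are assumptions, and the geographic area check is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    # Hypothetical record layout, for illustration only.
    sensor_id: int
    pm25: float          # PM2.5 concentration in ug/m3
    indoor: bool         # True if the sensor is marked as 'indoor'
    age_minutes: float   # age of the most recent measurement


def filter_readings(readings, blacklist=frozenset(),
                    max_age_minutes=60.0, top_percent=1):
    """Apply the minimal filtering rules to a list of readings."""
    kept = [
        r for r in readings
        if not r.indoor                        # ignore indoor sensors
        and r.pm25 >= 0                        # ignore negative values
        and r.age_minutes <= max_age_minutes   # data at most 1 hour old
        and r.sensor_id not in blacklist       # ignore known-bad sensors
    ]
    # Discard the highest 'top_percent' percent of the remaining values.
    kept = sorted(kept, key=lambda r: r.pm25)
    n_discard = len(kept) * top_percent // 100
    return kept[:len(kept) - n_discard]
```

With 100 valid readings, this drops exactly the single highest one. With fewer than 100 readings nothing is dropped, which is a consequence of the integer rounding chosen here.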

Interpolation

Since we only have data at a set of discrete points, the concentration at other points is estimated by combining data from all sensors using inverse distance weighting: each sensor's weight in the weighted average is the inverse of its distance *squared*. A nearby sensor therefore has a large effect, while a faraway sensor contributes only a little to the overall average.
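A minimal sketch of inverse distance weighting with a squared distance is shown below. The function name and the input layout (positions already converted to km) are assumptions for illustration, not the actual stofradar implementation.

```python
def idw_estimate(px, py, sensors):
    """Estimate the value at point (px, py) by inverse distance weighting.

    'sensors' is an iterable of (x_km, y_km, value) tuples. Each sensor
    gets a weight of 1 / distance^2. Returns None when there are no sensors.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in sensors:
        d2 = (px - sx) ** 2 + (py - sy) ** 2
        if d2 == 0.0:
            # Exactly on top of a sensor: use its value directly
            # and avoid a division by zero.
            return value
        weight = 1.0 / d2
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total
```

For example, with one sensor reporting 10 at 1 km distance and another reporting 40 at 2 km distance, the weights are 1 and 0.25, so the estimate is (10 + 10) / 1.25 = 16.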

To calculate the distance, I use a very simple approximation:

  • calculate the "middle" of the map (average latitude/longitude between top-left and bottom-right);
  • calculate the "km-per-degree-latitude" at the middle for latitude as 40075 km / 360 degrees;
  • calculate the "km-per-degree-longitude" at the middle for longitude as the number above multiplied with cos(latitude);
  • determine the difference in longitude and the difference in latitude;
  • convert both to km using the factors calculated earlier;
  • calculate the euclidean distance.
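The steps above can be sketched as follows. The function and parameter names are made up for illustration; only the 40075 km circumference and the cosine scaling come from the description.

```python
import math

EARTH_CIRCUMFERENCE_KM = 40075.0


def make_distance_fn(north, west, south, east):
    """Return a distance function tuned to the middle of the map."""
    # Middle of the map: only the latitude matters for the scale factors.
    mid_lat = (north + south) / 2.0
    km_per_deg_lat = EARTH_CIRCUMFERENCE_KM / 360.0
    km_per_deg_lon = km_per_deg_lat * math.cos(math.radians(mid_lat))

    def distance_km(lat1, lon1, lat2, lon2):
        # Convert both differences to km, then take the euclidean distance.
        dy = (lat2 - lat1) * km_per_deg_lat
        dx = (lon2 - lon1) * km_per_deg_lon
        return math.hypot(dx, dy)

    return distance_km
```

The scale factors are computed once per map rather than once per pixel/sensor pair, so the inner loop needs only two multiplications and a square root.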

Pixels that are not within a certain distance of any sensor station (e.g. 10 km) are rendered as grayscale, to indicate that each sensor only covers a limited geographic range.

Only sensors within a reasonable range of the map are taken into account, currently this is an area of 4 times (2x2) the visible area.

PM10 values < 0 ug/m3 are ignored in the software.

Colour range

Luchtmeetnet ranges

The colours I'm using are based on the scale used for air quality index from luchtmeetnet with data from RIVM, see https://www.luchtmeetnet.nl/informatie/luchtkwaliteit/luchtkwaliteitsindex-(lki)

The input value is the PM2.5 concentration.

Values in between these levels are interpolated linearly with respect to the RGB colour value and alpha channel.
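A minimal sketch of this kind of linear interpolation over RGBA colour stops follows. The stop levels, colours and alpha values below are hypothetical placeholders and are not the actual luchtmeetnet scale.

```python
from bisect import bisect_right

# Hypothetical (level in ug/m3, (R, G, B, A)) stops, for illustration only.
STOPS = [
    (0.0, (255, 255, 255, 0)),
    (25.0, (0, 255, 255, 128)),
    (50.0, (255, 255, 0, 128)),
    (100.0, (255, 0, 0, 128)),
]


def colour_for(value, stops=STOPS):
    """Linearly interpolate an RGBA colour for a concentration value."""
    # Clamp values outside the range to the first or last stop.
    if value <= stops[0][0]:
        return stops[0][1]
    if value >= stops[-1][0]:
        return stops[-1][1]
    levels = [level for level, _ in stops]
    i = bisect_right(levels, value)
    (lo, c_lo), (hi, c_hi) = stops[i - 1], stops[i]
    t = (value - lo) / (hi - lo)
    # Interpolate each of the four channels (R, G, B, alpha) separately.
    return tuple(round(a + t * (b - a)) for a, b in zip(c_lo, c_hi))
```

With these placeholder stops, a value of 12.5 lies halfway between the first two stops, so its alpha is 64, halfway between 0 and 128.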

Correction for high humidity

Humidity generally seems to cause an overestimation of PM concentrations when measuring with a "particle counting" type of PM sensor. The effect becomes really significant above approximately 70% relative humidity.

An interesting idea is to try to compensate for this effect, since the sensor.community sensor has an onboard humidity-sensor. Some papers/links about this:

However, I see the following problems with the formulas and coefficients in the opendata-stuttgart link above:

  • it combines formulas and coefficients from different sources where relative humidity has different units. One paper seems to use an RH-value from 0 to 100, while another uses a kind of normalized relative humidity (from 0 to 1). You cannot just use the same coefficients if the unit is different.
  • it claims a humidity correction for PM10 with coefficients that are not found in the source paper.

Animation

Besides an image with current data from the last 5 minutes, every hour two animations are created:

  • GIF animation composed of hourly images over the past 24 hours
  • WEBM animation composed of 5-minute images over the past 24 hours

You can click on the GIF animation to view the WEBM animation.

Software

See the GitHub page (https://github.com/bertrik/luftdatenmapper) for the source code.