一尘不染

地图聚类算法

algorithm

我当前的代码非常快,但是我需要使其更快,以便我们可以容纳更多的标记。有什么建议?

笔记:

  • 当按标记名称对SQL语句进行排序时,代码将以最快的速度运行-这本身就对标记进行了聚类(在相同位置的标记名称经常但不总是相似)。
  • 我无法预聚类标记,因为可以动态搜索和过滤标记。
  • 我已经尝试过基于网格的聚类-但是结果通常不是很好。
  • 我知道墨卡托投影上的星团有些偏斜。
  • 我对商业集群服务不感兴趣。

代码:

$singleMarkers = array();
$clusterMarkers = array();

while (count($markers)) {
    $marker  = array_pop($markers);
    $cluster = array();

    // Compare marker against all remaining markers.
    foreach ($markers as $key => $compareMarker) {
        // This function returns the distance between two markers, at a defined zoom level.
        $pixels = pixelDistance($marker['lat'], $marker['lng'], $compareMarker['lat'], $compareMarker['lng'], $zoomLevel);
        // If two markers are closer than defined distance, remove compareMarker from array and add to cluster.
        if ($pixels < $distance) {
            unset($markers[$key]);
            $cluster[] = $compareMarker;
        }
    }

    // If a marker was added to cluster, also add the marker we were comparing to.
    if (count($cluster) > 0) {
        $cluster[] = $marker;
        $clusterMarkers[] = $cluster;
    } else {
        $singleMarkers[] = $marker;
    }
}

function pixelDistance($lat1, $lon1, $lat2, $lon2, $zoom) {
    $x1 = $lon1*10000000; //This is what I did to compensate for using lat/lon values instead of pixels.
    $y1 = $lat1*10000000;
    $x2 = $lon2*10000000;
    $y2 = $lat2*10000000;

    return sqrt(pow(($x1-$x2),2) + pow(($y1-$y2),2)) >> (21 - $zoom); //21 is the max zoom level
}

更新

这是当前代码:

$singleMarkers = array();
$clusterMarkers = array();

// Minimum distance between markers to be included in a cluster, at diff. zoom levels
$DISTANCE = (10000000 >> $ZOOM) / 100000;

// Loop until all markers have been compared.
while (count($markers)) {
    $marker  = array_pop($markers);
    $cluster = array();

    // Compare against all markers which are left.
    foreach ($markers as $key => $target) {
        $pixels = abs($marker['lat']-$target['lat']) + abs($marker['lng']-$target['lng']);

        // If the two markers are closer than given distance remove target marker from array and add it to cluster.
        if ($pixels < $DISTANCE) {
            unset($markers[$key]);
            $cluster[] = $target;
        }
    }

    // If a marker has been added to cluster, add also the one we were comparing to.
    if (count($cluster) > 0) {
        $cluster[] = $marker;
        $clusterMarkers[] = $cluster;
    } else {
        $singleMarkers[] = $marker;
    }
}

阅读 259

收藏
2020-07-28

共1个答案

一尘不染

您实际上是否需要计算欧几里得距离?如果您只是比较距离的相对大小,则可以使用Manhattan距离来摆脱困境,这将为您节省两次致电到pow()一个sqrt()

function pixelDistance($lat1, $lon1, $lat2, $lon2, $zoom) {
    $x1 = $lon1*10000000; //This is what I did to compensate for using lat/lon values instead of pixels.
    $y1 = $lat1*10000000;
    $x2 = $lon2*10000000;
    $y2 = $lat2*10000000;

    return ($x1-$x2) + ($y1-$y2) >> (21 - $zoom);
}

不知道您是否需要该>> (21 - $zoom)位来进行计算,因此我将其保留。但是除非您确实需要在其他地方使用计算出的距离值,否则您可能可以直接使用纬度/经度就可以摆脱困境(无需乘以任何东西)
)并采用曼哈顿距离,假设您预先进行计算$distance以适合该度量,那么从计算角度讲,这比强迫所有距离都适合的单位和大小要便宜得多$distance

编辑:去年当我研究此问​​题时,我在Wikipedia上找到了一些有用的东西-是的,它可能会发生;-)

我也强烈推荐本书“ 编程集体智能:构建智能Web 2.0应用程序”
,该书深入探讨了群集,不仅适用于地理数据,还适用于数据分析的其他领域。

2020-07-28