Batch Processing Module

The following algorithms are currently available:

Density-based clustering of 2D objects (distributed clustering)

SBT

libraryDependencies ++= Seq(
  "com.here.platform.location" %% "location-spark" % "0.21.788"
)

Maven

<dependencies>
    <dependency>
        <groupId>com.here.platform.location</groupId>
        <artifactId>location-spark_${scala.compat.version}</artifactId>
        <version>0.21.788</version>
    </dependency>
</dependencies>

Gradle

dependencies {
    compile group: 'com.here.platform.location', name: 'location-spark_2.12', version:'0.21.788'
}

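Clustering

The clustering algorithm implements a distributed version of DBScan. It clusters geolocated items that are spread out geographically. The following figure illustrates how the algorithm behaves.

The input data contains various trips and reported events. The events are shown in green. The figure shows the clusters found among them. For each cluster, an instance of Cluster is returned. The blue markers indicate the cluster centers.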
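To make the density-based idea concrete, here is a minimal single-machine DBSCAN sketch. It is not the library's distributed implementation: the `DbscanSketch` object, the `Point` type, and the planar Euclidean distance are all assumptions made for illustration, and the sketch assumes that a point counts as its own neighbor when comparing against `minNeighbors`, which may differ from how the library counts.

```scala
import scala.collection.mutable

// Illustrative sketch only; names and conventions here are hypothetical.
object DbscanSketch {
  final case class Point(x: Double, y: Double)

  // Planar Euclidean distance. Real geolocations would need a geodesic distance.
  def distance(a: Point, b: Point): Double =
    math.hypot(a.x - b.x, a.y - b.y)

  /** Returns one cluster label per input point; -1 marks noise. */
  def dbscan(points: IndexedSeq[Point], radius: Double, minNeighbors: Int): Array[Int] = {
    val Unvisited = -2
    val Noise = -1
    val labels = Array.fill(points.length)(Unvisited)

    // Indices of all points within `radius` of point i, including i itself.
    def neighbors(i: Int): Seq[Int] =
      points.indices.filter(j => distance(points(i), points(j)) <= radius)

    var clusterId = 0
    for (i <- points.indices) {
      if (labels(i) == Unvisited) {
        val seeds = neighbors(i)
        if (seeds.length < minNeighbors) {
          // Not dense enough to start a cluster; may later become a border point.
          labels(i) = Noise
        } else {
          labels(i) = clusterId
          val queue = mutable.Queue.empty[Int]
          queue ++= seeds
          while (queue.nonEmpty) {
            val j = queue.dequeue()
            if (labels(j) == Noise) {
              labels(j) = clusterId // border point reached from a core point
            } else if (labels(j) == Unvisited) {
              labels(j) = clusterId
              val next = neighbors(j)
              // Only core points keep expanding the cluster.
              if (next.length >= minNeighbors) queue ++= next
            }
          }
          clusterId += 1
        }
      }
    }
    labels
  }
}
```

With two tight groups of three points each and one isolated point, `DbscanSketch.dbscan(points, radius = 1.5, minNeighbors = 3)` assigns the groups the labels 0 and 1 and marks the isolated point as noise (-1). A distributed version additionally has to split the data into partitions and reconcile clusters that cross partition borders, which this sketch does not attempt.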

You can use DistributedClustering as follows:

Scala

import com.here.platform.location.spark.{Cluster, DistributedClustering}
import org.apache.spark.rdd.RDD

// `Event`, `sensorData`, and `mapPointsToEvents` are application-defined and not shown here.
val events: RDD[Event] = mapPointsToEvents(sensorData)

val dc = new DistributedClustering[Event](neighborhoodRadiusInMeters = 20.0,
                                          minNeighbors = 3,
                                          partitionBufferZoneInMeters = 125.0)

val clusters: RDD[Cluster[EventWithPosition]] = dc(events)

val result = clusters.collect()
assert(result.nonEmpty)

// Print some statistics
val clusterCount = result.length
val clusteredEvents = result.map(_.events.length).sum.toDouble
println(s"Found $clusterCount clusters")
println(s"Found $clusteredEvents events in total.")
println(s"An average of ${clusteredEvents / clusterCount} events per cluster")

To extract the geolocation from your own Event type, an implicit instance of GeoCoordinateOperations must be in scope.
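The implicit-instance requirement follows the usual Scala type class pattern. The sketch below shows that pattern with a self-contained stand-in: the `PositionOf` trait, the `Event` case class, and the `centroid` function are hypothetical and are not the library's API. The actual GeoCoordinateOperations trait may declare different members, so check its API reference before implementing it.

```scala
// Illustrative sketch only; `PositionOf` is a hypothetical stand-in
// for a type class such as GeoCoordinateOperations.
object PositionSketch {
  trait PositionOf[T] {
    def latitude(t: T): Double
    def longitude(t: T): Double
  }

  // Hypothetical application event type.
  final case class Event(id: String, lat: Double, lon: Double)

  object Event {
    // Placing the instance in the companion object lets the compiler
    // find it without an explicit import.
    implicit val eventPosition: PositionOf[Event] = new PositionOf[Event] {
      def latitude(e: Event): Double = e.lat
      def longitude(e: Event): Double = e.lon
    }
  }

  // A generic function that only needs positions, as a clustering algorithm would.
  def centroid[T](items: Seq[T])(implicit pos: PositionOf[T]): (Double, Double) = {
    require(items.nonEmpty, "items must not be empty")
    val n = items.length.toDouble
    (items.map(pos.latitude).sum / n, items.map(pos.longitude).sum / n)
  }
}
```

Because the instance lives in the companion object of `Event`, a call such as `PositionSketch.centroid(events)` compiles without passing the instance explicitly. If no instance is in scope for a type, the call fails at compile time rather than at run time.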
