docs #15 : Updated the docs for functionality · 5c75a3a8
Michael Loukeris authored 2 months ago

functionality.rst 4.94 KiB

Functionality

This section introduces the utility functions used to calculate HyperTool-specific attributes for Native Nodes. For each function, we specify how the resulting attribute is exposed in Kubernetes (as a label, annotation, etc.), along with its dependencies, parameters, example output, and error handling.

interface_name_and_type

The interface_name_and_type annotations describe the name and type of the network interface used to reach a specific IP destination, such as the Kubernetes API server. The lookup is performed with the pyroute2 library.

The output provides the interface name and network type (e.g., ethernet, wireless) that the system would use to route traffic to the specified IP address. This information is extracted by querying the system's routing table.

Kubernetes type: Annotation

Dependencies:

pyroute2 : IPRoute class for routing table queries

Parameters:

ip (str): The destination IP address to query routing information

Example Output:

hyperai.eu/node-interface: eth0
hyperai.eu/node-network-type: ethernet

Error Handling:

If no route is found or an error occurs, the function returns default values:

hyperai.eu/node-interface: "unknown"
hyperai.eu/node-network-type: "unknown"
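
The lookup described above could be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes pyroute2's IPRoute netlink API, and the wireless check (looking for a `/sys/class/net/<name>/wireless` directory) is a hypothetical heuristic that does not come from the source.

```python
import os

DEFAULTS = {
    "hyperai.eu/node-interface": "unknown",
    "hyperai.eu/node-network-type": "unknown",
}


def interface_name_and_type(ip: str) -> dict:
    """Return the interface and network type used to route traffic to `ip`."""
    try:
        # Imported lazily so a missing dependency falls back to the defaults.
        from pyroute2 import IPRoute

        with IPRoute() as ipr:
            # Ask the kernel which route it would pick for this destination.
            routes = list(ipr.route("get", dst=ip))
            if not routes:
                return dict(DEFAULTS)
            oif = routes[0].get_attr("RTA_OIF")  # outgoing interface index
            if oif is None:
                return dict(DEFAULTS)
            links = list(ipr.get_links(oif))
            name = links[0].get_attr("IFLA_IFNAME") if links else None
            if not name:
                return dict(DEFAULTS)
    except Exception:
        # No route, no netlink access, invalid address, or missing library.
        return dict(DEFAULTS)

    # Assumption: Linux exposes a "wireless" directory for Wi-Fi devices.
    # Every other interface is reported as ethernet in this sketch.
    wireless = os.path.isdir(f"/sys/class/net/{name}/wireless")
    return {
        "hyperai.eu/node-interface": name,
        "hyperai.eu/node-network-type": "wireless" if wireless else "ethernet",
    }
```

Catching every exception keeps the behavior in line with the documented fallback: any failure yields the two "unknown" values rather than an error.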

node_available_interfaces

The node_available_interfaces annotation lists all network interfaces available on the node. This information is useful for understanding the node's network configuration.

Kubernetes type: Annotation

Dependencies:

netifaces : Provides a list of the available network interfaces on the system.

Parameters:

None

Example Output:

hyperai.eu/node-available-interfaces: eth0, eth1, wlan0

Error Handling:

There is no specific error handling for this function.
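
As a rough sketch of the above, the annotation value can be built by joining the interface names returned by `netifaces.interfaces()`. The helper name `format_interfaces_annotation` is hypothetical and only serves to separate the formatting step from the library call.

```python
def format_interfaces_annotation(names) -> dict:
    """Join interface names into the comma-separated annotation value."""
    return {"hyperai.eu/node-available-interfaces": ", ".join(names)}


def node_available_interfaces() -> dict:
    """Build the annotation from the interfaces present on this node."""
    # Third-party dependency; no error handling, matching the text above.
    import netifaces

    return format_interfaces_annotation(netifaces.interfaces())


example = format_interfaces_annotation(["eth0", "eth1", "wlan0"])
```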

geolocation

The geolocation labels provide the geographical location of the node based on its public IP address.

Kubernetes type: Label

Dependencies:

geocoder : Used to retrieve the geolocation information based on the public IP address.

Parameters:

None

Example Output:

hyperai.eu/node-geolocation-city: Athens
hyperai.eu/node-geolocation-region: Attica
hyperai.eu/node-geolocation-country: GR

Error Handling:

If the geolocation cannot be determined, the function returns default values:

hyperai.eu/node-geolocation-city: "unknown"
hyperai.eu/node-geolocation-region: "unknown"
hyperai.eu/node-geolocation-country: "unknown"
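
One possible shape for this function is sketched below. It assumes that `geocoder.ip("me")` is used for the lookup and that the result exposes `city`, `state`, and `country` attributes. The helper `build_geolocation_labels` is a hypothetical name introduced here for illustration.

```python
from types import SimpleNamespace

LABEL_PREFIX = "hyperai.eu/node-geolocation-"

# Label suffix -> attribute name on the lookup result (assumed mapping).
FIELDS = {"city": "city", "region": "state", "country": "country"}


def build_geolocation_labels(result) -> dict:
    """Map a lookup result to labels, using "unknown" for missing fields."""
    labels = {}
    for suffix, attr in FIELDS.items():
        value = getattr(result, attr, None)
        labels[LABEL_PREFIX + suffix] = str(value) if value else "unknown"
    return labels


def geolocation() -> dict:
    """Look up the node's public IP location, falling back to defaults."""
    try:
        import geocoder  # third-party dependency

        return build_geolocation_labels(geocoder.ip("me"))
    except Exception:
        return build_geolocation_labels(None)


# Stand-in for a lookup result, so the mapping can be shown offline.
example = build_geolocation_labels(
    SimpleNamespace(city="Athens", state="Attica", country="GR")
)
```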

get_monetary_cost_annotation

To infer the monetary cost category of a node (e.g., very low, low, medium, high, very high), we adopt a K-Means clustering approach trained on publicly available cloud instance pricing data.

Data Collection and Preprocessing

We collect on-demand instance data from AWS, available at: https://aws.amazon.com/ec2/pricing/on-demand/

This dataset includes the following attributes:

vCPU count
Memory (in GiB)
On-Demand Hourly Rate (in USD)

The data is cleaned to strip units (e.g., “GiB”, “$”) and converted into numeric format. These three numerical attributes are then normalized using the StandardScaler from the scikit-learn library.
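
The cleaning and normalization steps might look like the sketch below. To stay dependency-free it reimplements the scaling with the standard library instead of calling scikit-learn. The computation (zero mean, unit variance, population standard deviation) matches what StandardScaler does by default. The sample rows are made up for illustration and are not real AWS prices.

```python
import re
from statistics import fmean, pstdev


def to_number(raw: str) -> float:
    """Strip units such as 'GiB' or '$' and return the numeric part."""
    match = re.search(r"\d+(?:\.\d+)?", raw.replace(",", ""))
    if match is None:
        raise ValueError(f"no numeric value in {raw!r}")
    return float(match.group())


def standardize(rows):
    """Scale each column to zero mean and unit variance (population std)."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column keeps a scale of 1.0 to avoid division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    scaled = [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds


# Illustrative rows only: (vCPU, Memory, On-Demand Hourly Rate).
raw_rows = [
    ("2", "4 GiB", "$0.05"),
    ("4", "16 GiB", "$0.20"),
    ("8", "32 GiB", "$0.40"),
]
rows = [[to_number(cell) for cell in row] for row in raw_rows]
scaled, means, stds = standardize(rows)
```

The returned `means` and `stds` would need to be stored alongside the model, since the same scaling has to be applied at inference time.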

Clustering

We use K-Means clustering with k = 5 to group instances into cost-based clusters. The feature vector x used during training is defined as:

x = [vCPU, Memory, Price]

The output of clustering is a set of k cluster centroids:

μ₁, μ₂, ..., μₖ ∈ ℝ³

Each cluster is ranked by average price and assigned a qualitative label:

very low, low, medium, high, very high
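
The ranking step can be illustrated as follows. This sketch assumes the cluster assignments are already available (for example from the `labels_` attribute of a fitted scikit-learn KMeans model) and that exactly five clusters exist. The function name and the sample numbers are hypothetical.

```python
from collections import defaultdict
from statistics import fmean

COST_LABELS = ["very low", "low", "medium", "high", "very high"]


def label_clusters_by_price(assignments, prices) -> dict:
    """Rank clusters by mean price and map each cluster id to a cost label."""
    by_cluster = defaultdict(list)
    for cluster_id, price in zip(assignments, prices):
        by_cluster[cluster_id].append(price)
    # Cheapest cluster first, most expensive last.
    ranked = sorted(by_cluster, key=lambda cid: fmean(by_cluster[cid]))
    return {cid: COST_LABELS[rank] for rank, cid in enumerate(ranked)}


# Made-up assignments and hourly prices for seven instances.
assignments = [3, 3, 0, 4, 1, 2, 2]
prices = [0.01, 0.02, 0.10, 0.45, 1.20, 3.00, 3.50]
cluster_labels = label_clusters_by_price(assignments, prices)
```

K-Means cluster ids are arbitrary, so the mapping from id to label has to be derived from the data in this way and stored with the centroids.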

Cost Category Inference

At runtime, we infer the monetary cost category of a node based on its CPU and memory specifications:

x_inference = [vCPU, Memory, 0.0]

Key Insight: Price is used during clustering to shape the cost groupings, but the actual price of the current node is unknown at runtime. The price component of the inference vector is therefore set to 0.0 as a placeholder.

We compute the Euclidean distance between the normalized input vector and the stored cluster centroids:

ŷ = argmin_i ||(x_inference - μᵢ) / σ||

Here, μᵢ is the i-th centroid (trained with price included), and σ is the standard deviation vector used for normalization.
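
The formula above can be sketched as a nearest-centroid search. The code follows the formula as written, with the price component fixed at 0.0, and assumes the centroids are expressed in original units so that dividing the difference by σ is equivalent to comparing normalized vectors. It uses only the standard library rather than numpy, and the centroid and σ values are invented for illustration.

```python
import math


def infer_cost_cluster(vcpu, memory_gib, centroids, sigma) -> int:
    """Return the index of the centroid closest to [vCPU, Memory, 0.0]."""
    x = (float(vcpu), float(memory_gib), 0.0)  # price unknown at runtime

    def scaled_distance(mu):
        # Euclidean norm of (x - mu) / sigma, computed per component.
        return math.sqrt(
            sum(((xi - mi) / si) ** 2 for xi, mi, si in zip(x, mu, sigma))
        )

    return min(range(len(centroids)), key=lambda i: scaled_distance(centroids[i]))


# Illustrative centroids: [vCPU, Memory in GiB, Price in USD per hour].
centroids = [
    (2.0, 4.0, 0.05),
    (8.0, 32.0, 0.40),
    (64.0, 256.0, 3.00),
]
sigma = (20.0, 80.0, 1.0)
best = infer_cost_cluster(4, 16, centroids, sigma)
```

The returned index would then be translated into a cost category through the cluster-to-label mapping produced during training.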

Kubernetes type: Annotation

Dependencies:

numpy : Used to load the K-Means model and perform distance calculations.

Parameters:

None

Example Output:

hyperai.eu/node-monetary-cost-category: very low | low | medium | high | very high

Error Handling:

If the node's CPU or memory specifications are not available, the function returns the default value:

hyperai.eu/node-monetary-cost-category: "unknown"
