Solution: Cumulative Penalty Heuristic

Learn how to apply cumulative penalty heuristics to train a Bayesian network.

Let's imagine this scenario: We are city planners for a small town with ten distinct locations (nodes) connected by roads (edges). The locations are represented by letters A to J, and the roads have different distances (weights) between them. The town map and distances between locations are as follows:

Town map locations and road distances

When we convert this road network into a Bayesian network, each location (node) becomes a random variable, and each road (edge) becomes a directed conditional dependency between the two variables it connects.
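To make the node-and-edge reading concrete, here is a minimal sketch that turns a directed edge list into the factorization a Bayesian network encodes: one conditional term per node, conditioned on that node's parents. The edge list is copied from the solution's `add_edges_from` call further down; the helper names `parents_of` and `factorization` are hypothetical and not part of any library.

```python
from collections import defaultdict

# Directed edges (parent, child), copied from the solution's structure model.
EDGES = [
    ("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("C", "E"),
    ("D", "E"), ("D", "F"), ("E", "G"), ("F", "G"), ("G", "H"),
    ("G", "I"), ("H", "I"), ("I", "J"),
]


def parents_of(edges):
    """Map each node to the sorted list of its parents (hypothetical helper)."""
    parents = defaultdict(list)
    for parent, child in edges:
        parents[parent]  # touching the key makes root nodes appear too
        parents[child].append(parent)
    return {node: sorted(ps) for node, ps in sorted(parents.items())}


def factorization(edges):
    """Write the joint distribution as a product of per-node conditionals."""
    terms = []
    for node, ps in parents_of(edges).items():
        terms.append(f"P({node} | {', '.join(ps)})" if ps else f"P({node})")
    return " * ".join(terms)


print(factorization(EDGES))
```

Each term in the printed product corresponds to one conditional probability table that the network has to learn from data.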

In this scenario, we're simulating data that represents the connections and distances between locations in a town. We are assigning numerical values to these connections to create a dataset that reflects the structure of the town.
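As a rough illustration of what assigning numerical values to the connections amounts to, the sketch below propagates the edge weights from the solution code and expresses every location as a multiple of node A. The `WEIGHTS` table is copied from the solution's assignments (`B = 5 * A`, `D = 2 * B + 4 * C`, and so on); the function name `multiples_of_root` is hypothetical.

```python
# Edge weights copied from the solution code: child -> {parent: weight}.
WEIGHTS = {
    "B": {"A": 5},
    "C": {"A": 3},
    "D": {"B": 2, "C": 4},
    "E": {"D": 1, "C": 6},
    "F": {"D": 3},
    "G": {"F": 2, "E": 5},
    "H": {"G": 4},
    "I": {"H": 2, "G": 7},
    "J": {"I": 1},
}


def multiples_of_root(weights, root="A"):
    """Express every node as a multiple of the root (hypothetical helper).

    Assumes the dict lists each child after all of its parents,
    which holds for the table above.
    """
    coef = {root: 1}
    for child, parent_weights in weights.items():
        coef[child] = sum(w * coef[p] for p, w in parent_weights.items())
    return coef


coefficients = multiples_of_root(WEIGHTS)
for node, k in coefficients.items():
    print(f"{node} = {k} * A")
```

Because no noise is added and every multiplier is positive, each simulated node is a rescaled copy of A. Thresholding a node at its own mean should therefore give the same 0/1 pattern as thresholding A, up to floating-point rounding.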

Solution

Please find below the solution to the exercise in the previous lesson:

# Imports the solution relies on (NumPy, pandas, and CausalNex).
# The explanation's line numbers count from "# Simulate the continuous data".
import numpy as np
import pandas as pd
from causalnex.structure import StructureModel
from causalnex.network import BayesianNetwork
from causalnex.inference import InferenceEngine

# Simulate the continuous data
np.random.seed(42)
n_samples = 10000

# Simulate node A (mean: 1 std: 0.5)
A = np.random.normal(1, 0.5, n_samples)

# Simulate the other nodes as weighted sums of their parents
B = 5 * A
C = 3 * A
D = 2 * B + 4 * C
E = 1 * D + 6 * C
F = 3 * D
G = 2 * F + 5 * E
H = 4 * G
I = 2 * H + 7 * G
J = 1 * I

# Create the dataset
data = {"A": A, "B": B, "C": C, "D": D, "E": E, "F": F, "G": G, "H": H, "I": I, "J": J}

# Calculate the mean of each node
mean_values = {node: values.mean() for node, values in data.items()}

# Define thresholds as the mean value for each node
thresholds = mean_values
# Discretize: 1 if the value is above the node's threshold, otherwise 0
discrete_data = {node: (values > thresholds[node]).astype(int) for node, values in data.items()}

# Convert the discrete data to a pandas DataFrame
df = pd.DataFrame(discrete_data)

# Define the structure of the Bayesian network
sm = StructureModel()
sm.add_edges_from([
    ('A', 'B'),
    ('A', 'C'),
    ('B', 'D'),
    ('C', 'D'),
    ('C', 'E'),
    ('D', 'E'),
    ('D', 'F'),
    ('E', 'G'),
    ('F', 'G'),
    ('G', 'H'),
    ('G', 'I'),
    ('H', 'I'),
    ('I', 'J'),
])

# Create the Bayesian Network
bn = BayesianNetwork(sm)

# Fit the Bayesian Network
bn = bn.fit_node_states(df)
bn = bn.fit_cpds(df, method="BayesianEstimator", bayes_prior="K2")

# BASELINE query: marginals with no observations
ie = InferenceEngine(bn)
baseline = ie.query({})
rounded_baseline = {
    outer_key: {inner_key: round(value, 1) for inner_key, value in inner_dict.items()}
    for outer_key, inner_dict in baseline.items()}
print(rounded_baseline)

# Return the rounded baseline marginals (the argument is unused)
def test(n=None):
    return rounded_baseline
  • Line 6: This line generates a simulated dataset for node A, using ...