Steam is a popular digital distribution platform developed by Valve Corporation, offering a vast library of video games, software, and other multimedia content. It provides both a desktop application and a web platform, allowing users to purchase, download, and play their favorite games with ease. The platform is known for its extensive selection, user-friendly interface, and frequent sales, making it a go-to choice for gamers worldwide.
Recommendation systems are intelligent algorithms designed to provide personalized suggestions to users based on their preferences, behavior, or other relevant factors. These systems analyze data patterns to predict and recommend items that are likely to be of interest, enhancing user experience and engagement with the platform.
Steam uses a recommendation system to suggest games that users might be interested in based on their play history, friends' activity, and other factors. However, not all users log in when browsing the Steam store through a web browser, resulting in a lack of access to personalized data such as their game library and play history. This limitation poses a challenge to provide relevant game recommendations to users who are not logged in.
The business question addressed in this project is:
How can a recommendation system for the Steam platform be built that relies solely on game similarities, without requiring users to log in or access their personal library and play history?
This approach will enable Steam to provide valuable suggestions to users who browse the platform without logging in, potentially increasing engagement and sales.
The following three datasets from Kaggle have been used in this project to build the game similarity-based recommendation system:
This dataset contains general information about the games available on Steam, including their title, release date, developer, publisher, genres, and other relevant details. The dataset has the following columns:
This dataset is essential for the project as it provides the basic information about the games and allows for identifying the key characteristics of each game to be used in the similarity calculations.
This dataset provides additional information about the games in the form of descriptive texts, including a short description, detailed description, and other related texts. The columns in this dataset are:
The descriptive texts in this dataset are important for understanding the content and context of each game. By analyzing these texts, relevant features can be extracted to determine the similarities between games.
This dataset contains user-generated tags for each game, which help categorize and describe the games in more detail. The columns in this dataset are:
The user-generated tags in this dataset provide valuable insights into the popular features and characteristics of each game, as perceived by the community. Analyzing these tags helps us to determine the similarities between games based on the preferences and opinions of Steam users.
By combining the information from these three datasets, a comprehensive understanding of each game's features, content, and user perception can be created. This data will enable us to build an effective similarity-based recommendation system for users who browse the Steam platform without logging in.
In this step, I imported the necessary libraries and loaded the three datasets into separate DataFrame objects. I also set the index of each DataFrame to the unique game identifier (appid or steam_appid) for easier data manipulation; these identifiers serve as the primary and foreign keys linking the datasets.
import pandas as pd
import numpy as np
import spacy
from langdetect import detect
from datetime import datetime
from sklearn.preprocessing import MinMaxScaler
import pickle
from time import time
# Load Spacy's English model, using the medium-sized model for a balance between accuracy and processing time
nlp = spacy.load('en_core_web_md')
df = pd.read_csv('steam.csv')
df.set_index('appid', inplace=True)
tag_df = pd.read_csv('steamspy_tag_data.csv')
tag_df.set_index('appid', inplace=True)
des_df = pd.read_csv('steam_description_data.csv')
des_df.set_index('steam_appid', inplace=True)
pd.set_option('display.max_columns', 400)
Before building the recommendation system, I explored the steam DataFrame and preprocessed the data to ensure its quality and relevance. As shown below, roughly 77% of games have a value of 0 in the average_playtime column, making it unsuitable for use in the recommendation system. Games without English support are also removed from the steam DataFrame.
print(df.shape)
df.head()
(27075, 17)
name | release_date | english | developer | publisher | platforms | required_age | categories | genres | steamspy_tags | achievements | positive_ratings | negative_ratings | average_playtime | median_playtime | owners | price | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
appid | |||||||||||||||||
10 | Counter-Strike | 2000-11-01 | 1 | Valve | Valve | windows;mac;linux | 0 | Multi-player;Online Multi-Player;Local Multi-P... | Action | Action;FPS;Multiplayer | 0 | 124534 | 3339 | 17612 | 317 | 10000000-20000000 | 7.19 |
20 | Team Fortress Classic | 1999-04-01 | 1 | Valve | Valve | windows;mac;linux | 0 | Multi-player;Online Multi-Player;Local Multi-P... | Action | Action;FPS;Multiplayer | 0 | 3318 | 633 | 277 | 62 | 5000000-10000000 | 3.99 |
30 | Day of Defeat | 2003-05-01 | 1 | Valve | Valve | windows;mac;linux | 0 | Multi-player;Valve Anti-Cheat enabled | Action | FPS;World War II;Multiplayer | 0 | 3416 | 398 | 187 | 34 | 5000000-10000000 | 3.99 |
40 | Deathmatch Classic | 2001-06-01 | 1 | Valve | Valve | windows;mac;linux | 0 | Multi-player;Online Multi-Player;Local Multi-P... | Action | Action;FPS;Multiplayer | 0 | 1273 | 267 | 258 | 184 | 5000000-10000000 | 3.99 |
50 | Half-Life: Opposing Force | 1999-11-01 | 1 | Gearbox Software | Valve | windows;mac;linux | 0 | Single-player;Multi-player;Valve Anti-Cheat en... | Action | FPS;Action;Sci-fi | 0 | 5250 | 288 | 624 | 415 | 5000000-10000000 | 3.99 |
len(df[(df['average_playtime']==0)])/len(df)
0.7721144967682364
print(df[df['english']==0].shape[0])
df = df[df['english']!=0]
511
df.drop(columns=['platforms', 'average_playtime', 'median_playtime', 'price',
'required_age', 'achievements', 'steamspy_tags', 'english'],
inplace=True)
df.head()
name | release_date | developer | publisher | categories | genres | positive_ratings | negative_ratings | owners | |
---|---|---|---|---|---|---|---|---|---|
appid | |||||||||
10 | Counter-Strike | 2000-11-01 | Valve | Valve | Multi-player;Online Multi-Player;Local Multi-P... | Action | 124534 | 3339 | 10000000-20000000 |
20 | Team Fortress Classic | 1999-04-01 | Valve | Valve | Multi-player;Online Multi-Player;Local Multi-P... | Action | 3318 | 633 | 5000000-10000000 |
30 | Day of Defeat | 2003-05-01 | Valve | Valve | Multi-player;Valve Anti-Cheat enabled | Action | 3416 | 398 | 5000000-10000000 |
40 | Deathmatch Classic | 2001-06-01 | Valve | Valve | Multi-player;Online Multi-Player;Local Multi-P... | Action | 1273 | 267 | 5000000-10000000 |
50 | Half-Life: Opposing Force | 1999-11-01 | Gearbox Software | Valve | Single-player;Multi-player;Valve Anti-Cheat en... | Action | 5250 | 288 | 5000000-10000000 |
In this step, I explored the des_df DataFrame and preprocessed the data to ensure its quality and relevance for building the recommendation system.
des_df.sample(5)
detailed_description | about_the_game | short_description | |
---|---|---|---|
steam_appid | |||
744550 | Top-down tactical shooter with stealth element... | Top-down tactical shooter with stealth element... | Tactical Operations is a subsystem of investig... |
875240 | Many of us live in a noisy urban environment. ... | Many of us live in a noisy urban environment. ... | An artistic bonsai tree simulation game that f... |
758100 | <h2 class="bb_tag"><strong>About E-Startup</st... | <h2 class="bb_tag"><strong>About E-Startup</st... | E-Startup is a sandbox business simulation gam... |
503480 | Run, jump and slash your way through an epic, ... | Run, jump and slash your way through an epic, ... | Mahluk is a hack and slash platformer game wit... |
995000 | You Play as Erica who follows the adventurous ... | You Play as Erica who follows the adventurous ... | Explore, find clues, solve puzzles and make yo... |
des_df = des_df[['about_the_game']].copy()
des_df
about_the_game | |
---|---|
steam_appid | |
10 | Play the world's number 1 online action game. ... |
20 | One of the most popular online action games of... |
30 | Enlist in an intense brand of Axis vs. Allied ... |
40 | Enjoy fast-paced multiplayer gaming with Death... |
50 | Return to the Black Mesa Research Facility as ... |
... | ... |
1065230 | <img src="https://steamcdn-a.akamaihd.net/stea... |
1065570 | Have you ever been so lonely that no one but y... |
1065650 | <strong>Super Star Blast </strong>is a space b... |
1066700 | Pursue a snow-white deer through an enchanted ... |
1069460 | A portal has opened and dark magic is pouring ... |
27334 rows × 1 columns
# Print five random 'about_the_game' contents
def show_rnd_strings(des_df):
rn_indexes = des_df['about_the_game'].sample(5).index
for i in rn_indexes:
print(des_df['about_the_game'].loc[i])
print()
# show_rnd_strings(des_df)
# Remove HTML tags
pattern = '<.*?>'
des_df['about_the_game'] = des_df['about_the_game'].replace(pattern, ' ', regex=True)
# show_rnd_strings(des_df)
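To see what the non-greedy pattern actually strips, here is a quick illustration on a made-up description string (the sample text is invented for demonstration, in the style of the raw 'about_the_game' values):

```python
import re

# Hypothetical raw description snippet with embedded HTML markup
sample = '<h2 class="bb_tag"><strong>About E-Startup</strong></h2> A sandbox business simulation game.'

# '<.*?>' matches each tag non-greedily, so the text between tags survives
cleaned = re.sub('<.*?>', ' ', sample)
normalized = ' '.join(cleaned.split())  # collapse the leftover whitespace
print(normalized)  # About E-Startup A sandbox business simulation game.
```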
# Remove non-English content
def is_english(text):
    try:
        return detect(text) == 'en'
    except Exception:
        # detect() raises on empty or undecipherable text
        return False
# Filter rows that are in English
des_df = des_df[des_df['about_the_game'].apply(is_english)]
In this step, I explored the tag_df DataFrame and preprocessed the data to ensure its quality and relevance for building the recommendation system. Games that have no tags at all are removed from the tag_df DataFrame.
print(len(tag_df[tag_df.sum(axis=1)==0]), 'games have no tags')
tag_df = tag_df[tag_df.sum(axis=1) != 0]
575 games have no tags
To ensure the recommendation system has sufficient data for each game, I selected only the games that have complete information across all three DataFrames.
I performed an inner join on the three DataFrames to select only games that have all information available.
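The effect of an inner join on row counts can be sketched with two toy frames indexed by appid (the frames and values here are illustrative, not from the real datasets):

```python
import pandas as pd

# Toy frames indexed by appid
games = pd.DataFrame({'name': ['A', 'B']}, index=[10, 20])
descs = pd.DataFrame({'about_the_game': ['text for A']}, index=[10])

# An inner join keeps only the indices present in both frames,
# so appid 20 is dropped because it has no description
joined = games.join(descs, how='inner')
print(joined.shape)  # (1, 2)
```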
print('df rows:', df.shape[0])
print('tag_df rows:', tag_df.shape[0])
print('des_df rows:', des_df.shape[0])
df rows: 26564
tag_df rows: 28447
des_df rows: 27085
df = df.join(des_df, how='inner')
df = df.join(tag_df, how='inner')
print(df.shape)
(26507, 381)
To facilitate further data processing and feature engineering, separate DataFrames are created for each relevant aspect of the games.
name_df = df[['name']].copy()
gen_df = df[['genres']].copy()
cat_df = df[['categories']].copy()
tag_df = tag_df.loc[:, '1980s':].copy()
rating_df = df[['positive_ratings', 'negative_ratings', 'owners']].copy()
date_df = df[['release_date']].copy()
dev_df = df[['developer']].copy()
pub_df = df[['publisher']].copy()
des_df = df[['about_the_game']].copy()
scores_df = df[[]].copy()  # empty DataFrame keeping only the index, to be filled with scores later
Category and genre data are stored as single strings with multiple elements separated by semicolons. To facilitate the use of Jaccard similarity later on, these strings are transformed into lists.
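Jaccard similarity, mentioned above, compares two label sets by the ratio of their intersection to their union; a minimal sketch (the example category lists are illustrative):

```python
def jaccard_similarity(a, b):
    """Jaccard similarity between two lists of labels: |A ∩ B| / |A ∪ B|."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0  # define two empty label sets as having no similarity
    return len(sa & sb) / len(sa | sb)

# Two games sharing 2 of 4 distinct categories score 0.5
print(jaccard_similarity(['Single-player', 'Co-op', 'MMO'],
                         ['Single-player', 'Co-op', 'Local Co-op']))  # 0.5
```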
def get_all_list_elements(df, column):
    # Explode the list column and return its unique elements
    exploded = df.explode(column)
    return list(exploded[column].unique())
cat_df['cat_list'] = cat_df['categories'].apply(lambda x: x.split(';'))
cat_df.drop(columns=['categories'], inplace=True)
print(get_all_list_elements(cat_df, 'cat_list'))
cat_df.sample(5)
['Multi-player', 'Online Multi-Player', 'Local Multi-Player', 'Valve Anti-Cheat enabled', 'Single-player', 'Steam Cloud', 'Steam Achievements', 'Steam Trading Cards', 'Captions available', 'Partial Controller Support', 'Includes Source SDK', 'Cross-Platform Multiplayer', 'Stats', 'Commentary available', 'Includes level editor', 'Steam Workshop', 'In-App Purchases', 'Co-op', 'Full controller support', 'Steam Leaderboards', 'SteamVR Collectibles', 'Online Co-op', 'Shared/Split Screen', 'Local Co-op', 'MMO', 'VR Support', 'Mods', 'Mods (require HL2)', 'Steam Turn Notifications']
cat_list | |
---|---|
705210 | [Single-player, Multi-player, Online Multi-Pla... |
701900 | [Single-player, Partial Controller Support] |
240360 | [Single-player, Steam Achievements, Partial Co... |
854740 | [Single-player] |
532840 | [Single-player, Steam Achievements] |
keep_set = {'Co-op', 'Local Co-op', 'Local Multi-Player', 'MMO', 'Multi-player', 'Online Co-op',
'Online Multi-Player', 'Shared/Split Screen', 'Single-player'}
cat_df['cat_list'] = cat_df['cat_list'].apply(lambda x: list(set(x).intersection(keep_set)) if len(x)>0 else ['unknown'])
cat_df['cat_list'] = cat_df['cat_list'].apply(lambda x: x if len(x)>0 else ['unknown'])
get_all_list_elements(cat_df, 'cat_list')
['Multi-player', 'Online Multi-Player', 'Local Multi-Player', 'Single-player', 'Co-op', 'unknown', 'Online Co-op', 'Shared/Split Screen', 'Local Co-op', 'MMO']
df[cat_df['cat_list'].apply(lambda x: 'unknown' in x)].shape[0]
186
gen_df['gen_list'] = gen_df['genres'].apply(lambda x: x.split(';'))
gen_df.drop(columns=['genres'], inplace=True)
print(get_all_list_elements(gen_df, 'gen_list'))
gen_df.sample(5)
['Action', 'Free to Play', 'Strategy', 'Adventure', 'Indie', 'RPG', 'Animation & Modeling', 'Video Production', 'Casual', 'Simulation', 'Racing', 'Violent', 'Massively Multiplayer', 'Nudity', 'Sports', 'Early Access', 'Gore', 'Utilities', 'Design & Illustration', 'Web Publishing', 'Education', 'Software Training', 'Sexual Content', 'Audio Production', 'Game Development', 'Photo Editing', 'Accounting', 'Documentary', 'Tutorial']
gen_list | |
---|---|
827900 | [Action, Casual, Indie] |
951010 | [Action, Adventure, Indie] |
819940 | [Action, Indie, Early Access] |
631990 | [Violent, Adventure, Indie, Strategy] |
681150 | [Action, Adventure, Casual, Indie, Racing, Spo... |
len(df[cat_df['cat_list'].apply(lambda x: 'MMO' in x)])
400
len(df[gen_df['gen_list'].apply(lambda x: 'Massively Multiplayer' in x)][['categories','genres']])
692
# Games with both the 'MMO' category and the 'Massively Multiplayer' genre
mmo1 = len(df[(cat_df['cat_list'].apply(lambda x: 'MMO' in x)) &
              (gen_df['gen_list'].apply(lambda x: 'Massively Multiplayer' in x))])
# Games with the 'Massively Multiplayer' genre but without the 'MMO' category
mmo2 = len(df[(cat_df['cat_list'].apply(lambda x: 'MMO' not in x)) &
              (gen_df['gen_list'].apply(lambda x: 'Massively Multiplayer' in x))])
mmo1+mmo2
692
gen_df[(gen_df['gen_list'].apply(lambda x: 'Massively Multiplayer' in x)) &
       (gen_df['gen_list'].apply(lambda x: len(x) == 1))]
gen_list |
---|
i = cat_df[(gen_df['gen_list'].apply(lambda x: 'Massively Multiplayer' in x)) &
           (cat_df['cat_list'].apply(lambda x: 'MMO' not in x))].index
cat_df.loc[i, 'cat_list'] = np.array([x + ['MMO'] for x in cat_df.loc[i, 'cat_list']], dtype=object)
gen_df['gen_list'] = gen_df['gen_list'].apply(lambda x: [g for g in x if g != 'Massively Multiplayer'])
get_all_list_elements(gen_df, 'gen_list')
['Action', 'Free to Play', 'Strategy', 'Adventure', 'Indie', 'RPG', 'Animation & Modeling', 'Video Production', 'Casual', 'Simulation', 'Racing', 'Violent', 'Nudity', 'Sports', 'Early Access', 'Gore', 'Utilities', 'Design & Illustration', 'Web Publishing', 'Education', 'Software Training', 'Sexual Content', 'Audio Production', 'Game Development', 'Photo Editing', 'Accounting', 'Documentary', 'Tutorial']
To ensure that the tag data are on a consistent scale, the values in the tag_df
DataFrame are normalized to be between 0 and 1 by dividing each value by the maximum value in its row.
def divide_by_max(row):
    # Scale a row of tag counts to [0, 1] by dividing by the row maximum
    max_val = row.max()
    if max_val != 0:
        return row / max_val
    # All-zero rows were already filtered out; return them unchanged for safety
    return row
tag_df = tag_df.apply(divide_by_max, axis=1)
tag_df.sample(5)
(tag_df.sample(5) output omitted: 5 rows × roughly 370 tag columns, each value now scaled to [0, 1], with a game's most-voted tag equal to 1.0)
To better utilize the rating data and incorporate the number of owners as a weighting factor for the rating, the rating_df DataFrame is transformed as follows:
- A rating_score is computed by dividing the positive ratings by the total number of ratings.
- A weighted_rating is computed by multiplying the rating_score by the normalized owners values.
The advantage of this approach is that ratings backed by a larger owner base carry more weight, so a near-perfect score from a handful of reviews does not outrank well-established games.
rating_df['rating_score'] = rating_df['positive_ratings']/(rating_df['positive_ratings'] +
rating_df['negative_ratings'])
rating_df.drop(columns=['positive_ratings', 'negative_ratings'], inplace=True)
rating_df.sample(5)
owners | rating_score | |
---|---|---|
551750 | 0-20000 | 0.785714 |
656740 | 0-20000 | 0.965517 |
957390 | 0-20000 | 0.750000 |
304650 | 500000-1000000 | 0.828567 |
776000 | 0-20000 | 0.636364 |
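The rating_score is simply the share of positive reviews. A minimal sketch of the same formula with illustrative counts (not taken from the dataset):

```python
# Illustrative review counts (not from the dataset).
positive_ratings, negative_ratings = 22, 6

# rating_score = positive / (positive + negative), as in the code above.
rating_score = positive_ratings / (positive_ratings + negative_ratings)
print(round(rating_score, 6))  # 0.785714
```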
display(rating_df['owners'].value_counts())
rating_df['owners'] = rating_df['owners'].apply(lambda x: 0.6 if x=='0-20000' else
0.8 if x=='20000-50000' else
0.9 if x=='50000-100000' else
1)
display(rating_df['owners'].value_counts())
rating_df['weighted_rating'] = rating_df['rating_score']*rating_df['owners']
rating_df.drop(columns=['owners','rating_score'], inplace=True)
rating_df.sample(5)
0-20000                18129
20000-50000             3012
50000-100000            1671
100000-200000           1368
200000-500000           1268
500000-1000000           512
1000000-2000000          284
2000000-5000000          191
5000000-10000000          45
10000000-20000000         21
20000000-50000000          3
50000000-100000000         2
100000000-200000000        1
Name: owners, dtype: int64
0.6    18129
1.0     3695
0.8     3012
0.9     1671
Name: owners, dtype: int64
weighted_rating | |
---|---|
731520 | 0.600000 |
342660 | 0.400000 |
231140 | 0.569004 |
523210 | 0.698113 |
778150 | 0.507692 |
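The nested lambda above maps each owner band to a fixed weight. The same mapping can be expressed as a dictionary lookup, which is easier to extend; a minimal sketch with the weights copied from the code above (plain Python; on a pandas Series the dictionary could be applied with .map):

```python
# Owner-band weights copied from the notebook's lambda above.
OWNER_WEIGHTS = {
    '0-20000': 0.6,
    '20000-50000': 0.8,
    '50000-100000': 0.9,
}

def owner_weight(band: str) -> float:
    # Any band above 100000 owners gets full weight,
    # matching the final `else 1` branch of the original lambda.
    return OWNER_WEIGHTS.get(band, 1.0)

def weighted_rating(rating_score: float, band: str) -> float:
    # weighted_rating = rating_score * owner weight, as in the code above.
    return rating_score * owner_weight(band)

print(weighted_rating(0.785714, '0-20000'))        # ~0.471
print(weighted_rating(0.828567, '500000-1000000')) # ~0.829
```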
To better utilize the release date data, the date_df DataFrame is transformed as follows:
date_df['release_date'] = pd.to_datetime(date_df['release_date'])
reference_date = datetime.now()
date_df['days_since_release'] = (reference_date - date_df['release_date']).dt.days
scaler = MinMaxScaler()
date_df['days_norm'] = scaler.fit_transform(date_df[['days_since_release']])
date_df.drop(columns=['release_date', 'days_since_release'], inplace=True)
date_df.sample(5)
days_norm | |
---|---|
461350 | 0.073480 |
972740 | 0.021066 |
884410 | 0.035862 |
968470 | 0.021693 |
615910 | 0.086646 |
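One caveat of the code above: using datetime.now() as the reference makes days_norm drift between runs, so a fixed reference date is preferable for reproducibility. A minimal pure-Python sketch of the same min-max scaling, with illustrative dates and no sklearn:

```python
from datetime import datetime

# Illustrative release dates (not from the dataset).
release_dates = [datetime(2010, 5, 1), datetime(2016, 3, 15), datetime(2019, 1, 8)]
reference_date = datetime(2019, 6, 1)  # fixed reference instead of datetime.now()

days = [(reference_date - d).days for d in release_dates]

# Min-max scaling, as MinMaxScaler does: (x - min) / (max - min).
lo, hi = min(days), max(days)
days_norm = [(x - lo) / (hi - lo) for x in days]

print(days_norm)  # oldest game -> 1.0, newest -> 0.0
```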
The dev_df DataFrame contains many developer names with variations, additional characters, and sub-branches. After exploring the developer names, the following steps are taken to better utilize this data:
dev_df.sample(5)
developer | |
---|---|
849870 | MAMMOSSIX Co., Ltd. |
720250 | Reflect Studios |
50910 | Big Fish Games |
709350 | Mad Head Games |
494990 | AK84C |
def devlist(string):
for char in '•.\',()?!/&\"+:[]_-{}®©':
string = string.replace(char, '')
for s in ['BANDAI NAMCO Entertainment Inc','BANDAI NAMCO Studio Inc','BANDAI NAMCO Studios Inc','BANDAI NAMCO Studios Vancouver','BANDAI NAMCO Studios']:
string = string.replace(s, 'BANDAI NAMCO')
for s in ['Aspyr Mac Linux Windows Update','Aspyr Linux','Aspyr Mac Linux','Aspyr Mac']:
string = string.replace(s, 'Aspyr')
for s in ['2K Australia','2K Boston','2K China','2K Czech','2K Marin']:
string = string.replace(s, '2K')
for s in ['Ubisoft Shanghaï','Ubisoft - San Francisco','Ubisoft Annecy','Ubisoft Belgrade','Ubisoft Blue Byte','Ubisoft Bucharest','Ubisoft Bulgaria','Ubisoft Entertainment','Ubisoft Kiev','Ubisoft Milan','Ubisoft Montpellier','Ubisoft Montreal','Ubisoft Montreal Studio','Ubisoft Montreal, Massive Entertainment, and Ubisoft Shanghai','Ubisoft Montreal, Red Storm, Shanghai, Toronto, Kiev','Ubisoft Montréal','Ubisoft Paris','Ubisoft Pune','Ubisoft Quebec','Ubisoft Quebec, in collaboration with Ubisoft Annecy, Bucharest, Kiev, Montreal, Montpellier, Shanghai, Singapore, Sofia, Toronto studios','Ubisoft Reflections','Ubisoft Romania','Ubisoft San Francisco','Ubisoft Shanghai','Ubisoft Singapore','Ubisoft Sofia','Ubisoft Toronto']:
string = string.replace(s, 'Ubisoft')
for s in ['Rockstar Leeds','Rockstar New England','Rockstar North','Rockstar North / Toronto','Rockstar Studios','Rockstar Toronto']:
string = string.replace(s, 'Rockstar Games')
string = string.replace('Alternative Software Ltd','Alternative Software')
string = string.replace('Alternative Dreams Studios','Alternative Dreams')
string = string.replace('ARTDINK CORPORATION', 'ARTDINK')
string = string.replace('4 Fun Studio Inc','4 Fun Studio')
string = string.replace('FrameLineNetwork Kft','FrameLineNetwork',)
string = string.replace('Flight Systems LLC', 'Flight Systems',)
string = string.replace('Feral Interactive MacLinux', 'Feral interactive')
string = string.replace('Feral interactive Mac', 'Feral interactive')
string = string.replace('Feral Interactive Linux', 'Feral interactive')
string = string.replace('FarSight Studios Inc', 'FarSight Studios')
string = string.replace('FELISTELLA Co Ltd','FELISTELLA')
string = string.replace('ERS Game Studios','ERS GStudios')
string = string.replace('ERS Games Studio','ERS GStudios')
string = string.replace('Evil Tortilla Games Incorporated','Evil Tortilla Games')
string = string.replace('ECC GAMES SP Z OO', 'ECC GAMES SA')
string = string.replace('Deceptive Games Ltd', 'Deceptive Games')
string = string.replace('CyberConnect2 Co Ltd', 'CyberConnect2')
string = string.replace('CAPCOM CO LTD', 'CAPCOM')
string = string.replace('CAPCOM Co Ltd', 'CAPCOM')
string = string.replace('VRROOM Ultimate VR Experiences BV', 'VRROOM Ultimate VR Experiences')
string = string.replace('Subaltern Games LLC','Subaltern Games')
string = string.replace('Stumphead GamesLLC','Stumphead Games')
string = string.replace('Stainless GamesLtd','Stainless Games')
string = string.replace('Square Enix Montréal', 'Square Enix')
string = string.replace('Spark Plug Games LLC','Spark Plug Games')
string = string.replace('Spaces of Play UG','Spaces of Play')
string = string.replace('Sanzaru Games Inc','Sanzaru Games')
string = string.replace('Rocketeer Games Studio LLC','Rocketeer Games Studio')
string = string.replace('Random Thoughts Enterainment','Random Thoughts Entertainment')
string = string.replace('Napoleon Games sro', 'Napoleon Games')
string = string.replace('Monolith Productions, Inc','Monolith Productions')
string = string.replace('Modern Dream Ltd','Modern Dream')
string = string.replace('McMagic Productions sro','McMagic Productions')
string = string.replace('Kverta Limited', 'Kverta')
string = string.replace('Kool2Play Sp z oo', 'Kool2Play')
string = string.replace('Independent Arts Software GmbH','Independent Arts Software')
string = string.replace('Immersive VR Education Ltd','Immersive VR Education PLC')
string = string.replace('FromSoftware Inc', 'FromSoftware')
string = string.replace('Frima Studio Inc','Frima')
string = string.replace('Frima Studio','Frima')
string = string.lower()
string = string.replace(' ', '')
return string.split(";")
dev_df['dev_list'] = dev_df['developer'].apply(devlist)
dev_df.drop(columns=['developer'], inplace=True)
dev_df.sample(5)
dev_list | |
---|---|
7220 | [pendulostudios] |
464500 | [frontwing] |
428240 | [sharktreestudios] |
827270 | [salsawi] |
6420 | [mithisgames, thqnordic] |
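The long chain of str.replace calls in devlist is hard to maintain; the same canonicalization can be driven by a data table instead. A minimal sketch with a small subset of the aliases above (the full alias map would carry the remaining entries):

```python
# A few of the notebook's canonicalizations, expressed as data instead of
# a chain of str.replace calls (subset shown for illustration).
ALIASES = {
    'BANDAI NAMCO': ['BANDAI NAMCO Entertainment Inc', 'BANDAI NAMCO Studios Inc'],
    'Rockstar Games': ['Rockstar North', 'Rockstar Toronto'],
    'CAPCOM': ['CAPCOM CO LTD', 'CAPCOM Co Ltd'],
}
PUNCT = '•.\',()?!/&"+:[]_-{}®©'

def normalize(name: str) -> list[str]:
    # Same order as the original: strip punctuation first,
    # then collapse variants, then lowercase and split.
    for ch in PUNCT:
        name = name.replace(ch, '')
    for canonical, variants in ALIASES.items():
        for v in variants:
            name = name.replace(v, canonical)
    return name.lower().replace(' ', '').split(';')

print(normalize('Rockstar North'))  # ['rockstargames']
```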
Similar to the developer data, the pub_df DataFrame also contains variations and additional characters in the publisher names. Therefore, to better utilize this data, the same steps are taken:
def publist(string):
for char in '•.\',\\()?!/&\"#+:[]_-{}=%®©@$':
string = string.replace(char, '')
for s in ['Aspyr Mac Linux','Aspyr Linux','Aspyr Mac']:
string = string.replace(s, 'Aspyr')
for s in ['Bandai Namco', 'Bandai Namco Entertainment', 'BANDAI NAMCO Entertainment America', 'BANDAI NAMCO Entertainment Europe', 'BANDAI NAMCO Entertainment', 'BANDAI NAMCO Entertainement',]:
string = string.replace(s, 'BANDAI NAMCO')
for s in ['Bethesda Softworks','Bethesda-Soft','Bethesda-Softworks']:
string = string.replace(s, 'Bethesda')
for s in ['Dovetail Games - Fishing','Dovetail Games - Flight','Dovetail Games - TSW','Dovetail Games - Trains',]:
string = string.replace(s, 'Dovetail Games')
for s in ['CAPCOM CO LTD','CAPCOM Co Ltd','Capcom Co Ltd','Capcom USA Inc']:
string = string.replace(s, 'CAPCOM')
for s in ['Gaijin Distribution KFT','Gaijin Entertainment','Gaijin Entertainment Corporation','Gaijin inCubator']:
string = string.replace(s, 'Gaijin Games')
for s in ['Konami Digital Entertainement GmbH', 'Konami Digital Entertainment', 'Konami Digital Entertainment GmbH','Konami Digital Entertainment Inc']:
string = string.replace(s, 'Konami')
string = string.replace('Big Ant Studios Steam', 'Big Ant Studios')
string = string.replace( 'Big Fish Games Inc', 'Big Fish Games')
string = string.replace('Bitbox SL', 'Bitbox Ltd')
string = string.replace('Blazing Griffin Ltd','Blazing Griffin')
string = string.replace('Blob Games Studio','Blob Games')
string = string.replace('CM Softworks Inc','CM Softworks')
string = string.replace('Cartoon Network Games', 'Cartoon Network')
string = string.replace('CasGames', 'CasGame')
string = string.replace('Chorus Worldwide Games Limited','Chorus Worldwide')
string = string.replace('Circle 5 Studios','Circle 5')
string = string.replace('Crazy Rocks Studios','Crazy Rocks')
string = string.replace('Crazysoft Limited','Crazysoft Ltd')
string = string.replace('Crian Soft SA','Crian Soft')
string = string.replace('Empyrean Interactive','Empyrean')
string = string.replace('EuroVideo Medien GmbH','EuroVideo Medien')
string = string.replace('Fair Games Studio','Fair Games')
string = string.replace('Fantasy Flight Publishing Inc', 'Fantasy Flight Publishing')
string = string.replace('FarSight Studios Inc', 'FarSight Studios')
string = string.replace('Fatbot Games s r o', 'Fatbot Games')
string = string.replace('Fatmoth Interactive','Fatmoth')
string = string.replace('Feral Interactive MacLinux', 'Feral Interactive')
string = string.replace('Feral Interactive Linux', 'Feral Interactive')
string = string.replace('Feral Interactive Mac', 'Feral Interactive')
string = string.replace('Five Mind Creations UG haftungsbeschraenkt','Five Mind Creations')
string = string.replace('Fixpoint Productions Ltd', 'Fixpoint Productions')
string = string.replace('Flight Systems LLC','Flight Systems')
string = string.replace('Forthright Entertainment LLC','Forthright Entertainment')
string = string.replace('Fossil Games','Fossil')
string = string.replace('Frima Originals','Frima')
string = string.replace('Frima Studio','Frima')
string = string.replace('FromSoftware Japan','FromSoftware')
string = string.replace('FromSoftware Inc','FromSoftware')
string = string.replace('Game Troopers SL','Game Troopers')
string = string.replace('Gameforge 4D GmbHu202c', 'Gameforge 4D GmbH')
string = string.replace('GungHo Online Entertainment America Inc','GungHo Online Entertainment America')
string = string.replace('Hazardous Software Inc','Hazardous Software')
string = string.replace('Idea Factory International Inc','Idea Factory')
string = string.replace('Idea Factory International','Idea Factory')
string = string.replace('Immanitas Entertainment GmbH','Immanitas Entertainment')
string = string.replace('Kagura Games Chinese Localization','Kagura Games')
string = string.replace('Kerberos Productions Inc','Kerberos Productions')
string = string.replace('Kool2Play Sp z oo','Kool2Play')
string = string.replace('LB Studios','LB')
string = string.replace('Lemondo Entertainment', 'Lemondo Games')
string = string.replace('Lofty','Loft')
string = string.replace('MAGES Inc','MAGES')
string = string.replace('MGP Studios', 'MGP')
string = string.replace('MK game production', 'MK Games')
string = string.replace('MK-ULTRA Games', 'MK Games')
string = string.replace('MLBcom','MLB')
string = string.replace('Mayflower Entertainment KR','Mayflower Entertainment')
string = string.replace('McMagic Productions sro','McMagic Productions')
string = string.replace('NS STUDIO','NS')
string = string.replace('Nexon America Inc','Nexon',)
string = string.replace('Nexon America','Nexon',)
string = string.replace('Nexon Korea Corporation','Nexon')
string = string.replace('Oddworld Inhabitants Inc','Oddworld Inhabitants')
string = string.replace('Outright Games Ltd', 'Outright Games')
string = string.replace('Perfect Square Studios LLC','Perfect Square Studios')
string = string.replace('Praxia Entertainment Inc','Praxia Entertainment')
string = string.replace('SelfPublished','Selfp')
string = string.replace('Ubisoft Entertainment','Ubisoft')
string = string.replace( 'Viva Media Inc', 'Viva Media')
string = string.replace('Warner Bros Interactive Entertainment','Warner Bros')
string = string.replace('Warner Bros Interactive','Warner Bros')
string = string.replace('crowgames UG haftungsbeschränkt','crowgames')
string = string.replace('方块游戏 Asia', '方块游戏')
string = string.replace('方块游戏CubeGame', '方块游戏')
string = string.lower()
string = string.replace(' ', '')
if string == '':
string = 'unknown'
return string.split(";")
pub_df['pub_list'] = pub_df['publisher'].apply(publist)
pub_df.drop(columns=['publisher'], inplace=True)
pub_df.sample(5)
pub_list | |
---|---|
985930 | [watertemplestudio] |
502500 | [bandainamco] |
881250 | [fyrg] |
544790 | [aelentertainment] |
997280 | [flashynurav] |
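The resulting dev_list and pub_list columns hold lists of normalized names. To use them in similarity calculations, each list can be one-hot encoded over the vocabulary of all names (scikit-learn's MultiLabelBinarizer implements the same idea). A minimal stdlib sketch with illustrative data:

```python
# Illustrative developer lists in the same shape as dev_list / pub_list above.
games = {
    7220:   ['pendulostudios'],
    6420:   ['mithisgames', 'thqnordic'],
    464500: ['frontwing'],
}

# Build the vocabulary of all names, then a binary vector per game.
vocab = sorted({name for names in games.values() for name in names})
vectors = {
    app_id: [1 if name in names else 0 for name in vocab]
    for app_id, names in games.items()
}

print(vocab)
print(vectors[6420])  # two 1s: mithisgames and thqnordic
```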
The des_df DataFrame contains the 'about_the_game' descriptions for the games. To convert these descriptions into numerical vectors that can be used for similarity calculations, I used the doc.vector attribute from the spaCy library, which yields a 300-dimensional embedding of the text. This method is useful because it takes the context and the meaning of the words into account.
spaCy is a popular open-source library for natural language processing in Python. It is designed to be fast, efficient, and easy to use, offering capabilities such as tokenization, part-of-speech tagging, and dependency parsing.
The advantage of using spaCy for processing game descriptions is that it simplifies converting text data into numerical vectors while preserving the semantic relationships between words, allowing the recommendation system to capture similarities between games based on their descriptions.
# nlp is assumed to be a spaCy model with word vectors, e.g. spacy.load('en_core_web_lg')
def get_vector(text):
    doc = nlp(text)
    return doc.vector
start_time = time()
des_vectors = des_df['about_the_game'].apply(get_vector)
print(f"Elapsed time: {time()-start_time} seconds")
des_vectors_df = pd.DataFrame(des_vectors.to_list())
des_vectors_df = des_vectors_df.rename(columns={i: f'description V_{i+1}' for i in range(len(des_vectors_df.columns))})
des_vectors_df.index = des_df.index
des_vectors_df.sample(5)
Elapsed time: 839.2062339782715 seconds
(Output: a 5-row sample of des_vectors_df with 300 columns, description V_1 through description V_300, holding the components of each game's description vector; values omitted for brevity.)
862770 | -0.682849 | 0.170309 | -0.231356 | -0.094217 | -0.136815 | -0.068679 | -0.002768 | -0.122273 | 0.038507 | 2.028709 | -0.135112 | -0.101387 | -0.057832 | 0.056614 | -0.177816 | -0.022841 | -0.096315 | 0.709027 | -0.199330 | 0.033213 | -0.039119 | 0.032293 | 0.022690 | -0.211917 | -0.023907 | -0.048426 | -0.097131 | -0.107827 | -0.002865 | -0.165915 | -0.122423 | 0.068857 | -0.159815 | -0.068435 | 0.175638 | 0.152144 | -0.042198 | -0.001290 | 0.013768 | 0.042586 | 0.027681 | -0.054343 | -0.035703 | -0.006189 | 0.072237 | -0.083859 | -0.051633 | 0.098153 | -0.164706 | 0.026209 | -0.105801 | 0.238401 | 0.048860 | -0.008619 | 0.004743 | 0.050648 | -0.081439 | -0.068946 | 0.017485 | -0.038990 | -0.099314 | -0.118938 | -0.005817 | 0.073760 | 0.242943 | -0.117606 | 0.019671 | 0.008802 | 0.100667 | -0.066440 | 0.137499 | 0.092272 | 0.116666 | -0.049707 | -0.014240 | 0.110970 | 0.099217 | -0.141350 | -0.004902 | 0.231471 | -0.114457 | 0.005726 | 0.022973 | -0.066244 | 0.006616 | -0.086779 | 0.307084 | -0.203609 | -0.054981 | -0.118228 | -0.066743 | 0.013136 | -0.123476 | 0.007522 | 0.018734 | -0.205330 | 0.090131 | -0.036878 | 0.054900 | 0.027547 | -0.093224 | 0.102920 | -0.087827 | 0.020134 | 0.261152 | -1.172634 | -0.020813 | 0.006972 | -0.064017 | -0.035322 | -0.009502 | -0.219104 | -0.016765 | -0.206596 | -0.112912 | 0.053294 | 0.031006 | -0.183848 | -0.110365 | -0.121161 | 0.057413 | 0.073227 | 0.011546 | -0.181135 | -0.085888 | -0.043857 | 0.004942 | -0.060996 | -0.107527 | -0.001543 | -0.055558 | 0.004822 | 0.038277 | 0.077140 | 0.074018 | -0.112878 | -0.181409 | 0.026935 | -0.025865 | -0.071927 | -1.616935 | 0.198859 | 0.207191 | -0.024448 | -0.091263 | -0.035404 | -0.033805 | 0.053571 | 0.060930 | -0.034319 | 0.000552 | 0.148003 | 0.042058 | 0.065134 | 0.015708 | -0.025825 | -0.117955 | 0.066899 | 0.031643 | -0.182974 | -0.008936 | -0.057417 | 0.103808 | 0.075194 | -0.064962 | -0.068457 | 0.103667 | 0.010154 | 0.016475 | -0.095694 | -0.063738 | -0.097768 | 
0.045053 | -0.064301 | 0.157072 | 0.004769 | -0.149810 | 0.102744 | 0.124299 | -0.005614 | 0.012693 | -0.059330 | -0.275666 | 0.137123 | -0.095945 | 0.071269 | -0.016135 | 0.008665 | 0.101784 | 0.150139 | 0.240777 | -0.070375 | -0.026369 | 0.021059 | -0.103141 | -0.068717 | -0.039507 | -0.080657 | -0.114225 | 0.144231 | 0.082233 | -0.094612 | 0.012410 | 0.012930 | -0.044801 | -0.010337 | 0.029677 | 0.065439 | 0.072764 | 0.088320 | 0.028538 | 0.094973 | 0.015604 | -0.056339 | 0.010292 | -0.055299 | 0.061106 | 0.075525 | -0.190454 | -0.107463 | -0.029405 | 0.074189 | -0.189891 | -0.157183 | 0.053969 | -0.159083 | -0.074799 | 0.036963 | 0.027582 | -0.019515 | 0.012080 | -0.103504 | 0.165947 | 0.009393 | 0.039939 | 0.059420 | 0.131620 | -0.043044 | -0.016938 | -0.034947 | 0.064932 | 0.009064 | 0.013792 | 0.222273 | 0.090078 | -0.098880 | -0.014916 | -0.182763 | -0.019419 | 0.259444 | -0.004355 | 0.001964 | 0.058178 | -0.064742 | -0.053085 | 0.186741 | -0.075461 | -0.063339 | 0.070639 | 0.154460 | -0.208072 | 0.059682 | 0.019610 | 0.110100 | 0.175815 | -0.023355 | -0.090083 | 0.097967 | 0.374020 | 0.148482 | 0.189816 | -0.110123 | -0.107294 | -0.240783 | -0.088121 | 0.045111 | -0.028933 | -0.035140 | -0.009299 | 0.264282 | -0.040687 | -0.038236 | -0.023089 | 0.034577 | 0.045744 | 0.112811 | 0.111821 | -0.214316 | 0.072317 | -0.043734 | -0.195971 | 0.073442 | -0.125935 | -0.047837 | -0.009339 | 0.236287 | -0.048770 | -0.142264 | 0.024141 | 0.030714 |
des_vectors_df[des_vectors_df.isna().any(axis=1)]
description V_1 | description V_2 | description V_3 | ... | description V_300
---|
(0 rows × 300 columns — no description vector contains missing values)
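The `isna().any(axis=1)` mask above selects rows containing at least one missing value; for the description vectors it returns an empty frame. The same pattern on a toy frame (values hypothetical):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({'V_1': [0.1, np.nan, 0.3],
                    'V_2': [0.2, 0.5, np.nan]})

# keep only rows that contain at least one NaN
rows_with_nan = toy[toy.isna().any(axis=1)]

print(rows_with_nan.index.tolist())   # [1, 2]
```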
The proposed recommendation pipeline relies on the availability of precomputed, normalized tables.
Precomputing is particularly beneficial because filtering for English-only content and converting spaCy Doc objects to vectors are time-consuming steps.
To use these tables efficiently, they can be computed ahead of time and stored alongside the standard Steam tables, which makes it straightforward to incorporate updated scores whenever the dataset changes.
The pickle module will be used within the pipeline to store and retrieve these tables.
import pickle

stored_tables = {'scores_df': scores_df,
                 'dev_df': dev_df,
                 'pub_df': pub_df,
                 'tag_df': tag_df,
                 'gen_df': gen_df,
                 'cat_df': cat_df,
                 'rating_df': rating_df,
                 'date_df': date_df,
                 'des_vectors_df': des_vectors_df,
                 'name_df': name_df,
                 'title': df['name'],
                 'rel_date': df['release_date']}

# serialize every precomputed table into a single file
with open('steam_eng_tables', 'wb') as f:
    pickle.dump(stored_tables, f)
import numpy as np
import pandas as pd
import pickle
with open('steam_eng_tables', 'rb') as f:
stored_tables = pickle.load(f)
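Because everything travels in one dict, a quick check after reloading can confirm the round trip preserved each table. A minimal sketch with toy stand-ins for the stored tables (an in-memory buffer replaces the steam_eng_tables file; all names here are illustrative):

```python
import io
import pickle

import pandas as pd

# toy stand-ins for the precomputed tables
tables = {'dev_df': pd.DataFrame({'dev_list': [['Valve'], ['CD Projekt Red']]}),
          'title': pd.Series(['Counter-Strike', 'The Witcher 3'])}

buf = io.BytesIO()          # in-memory substitute for the pickle file
pickle.dump(tables, buf)
buf.seek(0)
loaded = pickle.load(buf)

# keys and contents survive the round trip unchanged
assert loaded.keys() == tables.keys()
assert loaded['title'].equals(tables['title'])
assert loaded['dev_df'].equals(tables['dev_df'])
```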
Functions to calculate similarity scores were created for each DataFrame using different methods based on the nature of the information stored:
def calculate_dev_score(game_devs, other_game_devs):
    # tiered score based on how many developers the two games share
    intersection = set(game_devs) & set(other_game_devs)
    num_common_devs = len(intersection)
    num_devs = len(game_devs)
    num_other_devs = len(other_game_devs)
    if num_common_devs == num_devs and num_common_devs == num_other_devs:
        return 1
    elif num_common_devs > 0 and num_common_devs >= (num_devs / 2):
        return 0.8
    elif num_common_devs > 0:
        return 0.5
    else:
        return 0

def get_dev_scores(df, game_index):
    # compare every game's developer list against the target game's
    target = df.loc[game_index].iloc[0]
    dev_scores = df.iloc[:, 0].apply(lambda x: calculate_dev_score(target, x))
    return dev_scores.rename('dev_score').to_frame()
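To make the tiers concrete, the same logic can be exercised on toy developer lists (a standalone restatement of calculate_dev_score; the names are illustrative):

```python
def calculate_dev_score(game_devs, other_game_devs):
    # same tiered logic as above, stated compactly
    common = len(set(game_devs) & set(other_game_devs))
    if common == len(game_devs) == len(other_game_devs):
        return 1
    elif common > 0 and common >= len(game_devs) / 2:
        return 0.8
    elif common > 0:
        return 0.5
    return 0

print(calculate_dev_score(['Valve'], ['Valve']))        # 1    identical developer sets
print(calculate_dev_score(['A', 'B'], ['A']))           # 0.8  at least half of the target's devs overlap
print(calculate_dev_score(['A', 'B', 'C'], ['A']))      # 0.5  some overlap, less than half
print(calculate_dev_score(['A'], ['X']))                # 0    no overlap
# note the asymmetry: the half-overlap threshold is relative to the target game's list
print(calculate_dev_score(['A'], ['A', 'B', 'C']))      # 0.8
```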
get_dev_scores(dev_df, 10).T
 | 10 | 20 | 30 | ... | 1069460
---|---|---|---|---|---
dev_list | 1.0 | 1.0 | 1.0 | ... | 0.0

1 rows × 26507 columns (output truncated: one developer-overlap score per game, measured against appid 10)
def calculate_pub_score(game_pubs, other_game_pubs):
    # tiered score based on how many publishers the two games share
    intersection = set(game_pubs) & set(other_game_pubs)
    num_common_pubs = len(intersection)
    num_pubs = len(game_pubs)
    num_other_pubs = len(other_game_pubs)
    if num_common_pubs == num_pubs and num_common_pubs == num_other_pubs:
        return 1
    elif num_common_pubs > 0 and num_common_pubs >= (num_pubs / 2):
        return 0.8
    elif num_common_pubs > 0:
        return 0.5
    else:
        return 0

def get_pub_scores(df, game_index):
    # compare every game's publisher list against the target game's
    target = df.loc[game_index].iloc[0]
    pub_scores = df.iloc[:, 0].apply(lambda x: calculate_pub_score(target, x))
    return pub_scores.rename('pub_score').to_frame()
get_pub_scores(pub_df, 10).T
 | 10 | 20 | 30 | ... | 1069460
---|---|---|---|---|---
pub_list | 1 | 1 | 1 | ... | 0

1 rows × 26507 columns (output truncated: one publisher-overlap score per game, measured against appid 10)
# normalize the vector lengths to 1, needed to calculate cosine similarity
def normalize_df(df):
    norm = np.linalg.norm(df, axis=1)[:, np.newaxis]
    return df / norm

# calculate cosine similarity only for the target row, instead of the whole matrix
def single_row_cosine_similarity(df, target_row, title):
    normalized_df = normalize_df(df)
    target_row_normalized = normalized_df.loc[target_row]
    cosine_similarities = normalized_df @ target_row_normalized
    return cosine_similarities.rename(title).to_frame()
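Once every row has unit length, the `@` product against the target row yields all cosine similarities in a single pass. A small self-contained check with hypothetical 2-D tag vectors:

```python
import numpy as np
import pandas as pd

# hypothetical tag vectors for three games, indexed by appid
tags = pd.DataFrame([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]],
                    index=[10, 20, 30], columns=['Action', 'FPS'])

# scale each row to unit length (the same computation normalize_df performs)
normalized = tags.div(np.linalg.norm(tags, axis=1), axis=0)

# cosine similarity of every game against appid 10
sims = normalized @ normalized.loc[10]

print(sims.round(3).tolist())   # [1.0, 0.0, 0.707]
```

Identical vectors score 1.0, orthogonal vectors 0.0, and a 45-degree vector about 0.707, matching the cosine of the angle between the rows.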
display(single_row_cosine_similarity(tag_df, 10, 'tag_score').T)
display(single_row_cosine_similarity(des_vectors_df, 10, 'des_score').T)
appid | 10 | 20 | 30 | ... | 1069460
---|---|---|---|---|---
tag_score | 1.0 | 0.891156 | 0.776308 | ... | ...

(output truncated: one row of 26507 tag-based cosine similarities against appid 10, followed by the analogous des_score row)
0.365199 | 0.451944 | ... | 0.049884 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.313318 | 0.405186 | 0.230858 | 0.269242 | 0.0 | 0.239286 | 0.0 | 0.120651 | 0.0 | 0.0 | 0.341589 | 0.068003 | 0.0 | 0.08797 | 0.0 | 0.0 | 0.0 | 0.037822 | 0.346326 | 0.0 | 0.0 | 0.385622 | 0.335769 | 0.0 | 0.330833 | 0.336082 | 0.291193 | 0.0 | 0.0 | 0.345946 | 0.0 | 0.0 | 0.0 | 0.264615 | 0.0 | 0.0 | 0.035159 | 0.23787 | 0.293063 | 0.0 | 0.57302 | 0.322645 | 0.0 | 0.0 | 0.260533 | 0.06642 | 0.23756 | 0.0 | 0.335844 | 0.204374 | 0.0 | 0.036448 | 0.0 | 0.254948 | 0.198825 | 0.0 | 0.0 | 0.2843 | 0.0 | 0.0 | 0.0 | 0.110498 | 0.0 | 0.0 | 0.0 | 0.258559 | 0.0 | 0.266951 | 0.330833 | 0.0 | 0.289811 | 0.246862 | 0.035575 | 0.0 | 0.351527 | 0.0 | 0.171247 | 0.0 | 0.0 | 0.325823 | 0.0 | 0.0 | 0.0 | 0.0 | 0.321213 | 0.330833 | 0.335844 | 0.325388 | 0.289899 | 0.242648 | 0.416006 | 0.047327 | 0.283023 | 0.038636 | 0.149316 | 0.212818 | 0.068855 | 0.43992 | 0.346326 | 0.049723 | 0.002388 | 0.0 | 0.0 | 0.011584 | 0.0 | 0.02938 | 0.29332 | 0.116428 | 0.47306 | 0.0 | 0.025955 | 0.36271 | 0.0 | 0.0 | 0.352223 | 0.19975 | 0.003882 | 0.336082 | 0.126195 | 0.2271 | 0.38637 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.335844 | 0.0 | 0.01536 | 0.053072 | 0.341089 | 0.04038 | 0.039211 | 0.0 | 0.356388 | 0.0 | 0.341314 | 0.036076 | 0.041242 | 0.011095 | 0.0 | 0.0 | 0.320078 | 0.0 | 0.0 | 0.0 | 0.228886 | 0.405186 | 0.389447 | 0.155952 | 0.0 | 0.0 | 0.069004 | 0.049944 | 0.1051 | 0.0 | 0.0 | 0.0 | 0.008012 | 0.070318 | 0.296561 | 0.039279 | 0.014396 | 0.0 | 0.304668 | 0.187702 | 0.0 | 0.0 | 0.0 | 0.405186 | 0.311302 | 0.0 | 0.049282 | 0.0 | 0.344758 | 0.0 | 0.266099 | 0.0 | 0.0 | 0.237425 | 0.0 | 0.289899 | 0.270747 | 0.064666 | 0.239978 | 0.0 | 0.09826 | 0.0 | 0.065322 | 0.0 | 0.454207 | 0.330833 | 0.0 | 0.0 |
1 rows × 28447 columns
[Output truncated: des_score row, 1 row × 26507 columns — the description-based similarity of every game (indexed by Steam app id) to the target game, with values in [0, 1] and 1.0 for the target itself.]
# Function that calculates the Jaccard similarity between two sets of items
def jaccard_similarity(target, row2):
    intersection = len(set(target) & set(row2))
    union = len(set(target).union(set(row2)))
    # guard against two empty sets, which would otherwise divide by zero
    return intersection / union if union else 0.0
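As a quick sanity check of the formula, the function can be exercised on two small hand-made tag lists (the function is re-stated here so the snippet runs on its own):

```python
def jaccard_similarity(target, row2):
    # |A ∩ B| / |A ∪ B|
    intersection = len(set(target) & set(row2))
    union = len(set(target).union(set(row2)))
    return intersection / union if union else 0.0

# 1 shared genre ('Action') out of 3 distinct genres overall -> 1/3
print(jaccard_similarity(['Action', 'Adventure'], ['Action', 'RPG']))
# identical lists -> 1.0
print(jaccard_similarity(['Indie'], ['Indie']))
```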
# Calculate the Jaccard similarity between one target row and every row,
# instead of computing the whole pairwise similarity matrix
def single_row_jaccard_similarity(df, target_row, title):
    # look up the target game's item list once, then compare it to every row
    target = df.loc[target_row].iloc[0]
    similarities = df.iloc[:, 0].apply(lambda x: jaccard_similarity(target, x))
    similarities = similarities.to_frame()
    similarities.columns = [title]
    return similarities

display(single_row_jaccard_similarity(gen_df, 10, 'gen_score').T)
display(single_row_jaccard_similarity(cat_df, 10, 'cat_score').T)
[Output truncated: gen_score row, 1 row × 26507 columns — the genre-based Jaccard similarity of every game (indexed by Steam app id) to app id 10, with values in [0, 1] and 1.0 for app 10 itself.]
[Output truncated: cat_score row, 1 row × 26507 columns — the category-based Jaccard similarity of every game (indexed by Steam app id) to app id 10, with values in [0, 1] and 1.0 for app 10 itself.]
# smoothing must be between 0 and 1 in order to bend the linear distance
# into a concave curve (with its slope pulled toward the top left)
def date_score(df, index, smoothing):
    target = df.loc[index].values[0]
    df = (1 - abs(df - target)) ** smoothing
    df.columns = ['date_score']
    return df

date_score(date_df, 10, 0.8).T
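The effect of the exponent can be illustrated on a toy table of normalized release dates (the app ids and date values below are made up; 0 stands for the oldest game and 1 for the newest):

```python
import pandas as pd

def date_score(df, index, smoothing):
    # smoothing in (0, 1) bends the linear closeness 1 - |d1 - d2|
    # into a concave curve, so nearby dates are penalized less
    target = df.loc[index].values[0]
    df = (1 - abs(df - target)) ** smoothing
    df.columns = ['date_score']
    return df

# hypothetical normalized release dates, indexed by app id
toy_date_df = pd.DataFrame({'release': [0.9, 0.8, 0.1]}, index=[10, 20, 30])

res = date_score(toy_date_df, 10, 0.8)
print(res)  # app 10 scores 1.0; scores decay as the release dates diverge
```

Because 0.9**0.8 &gt; 0.9 while the ordering of the scores is unchanged, the exponent only softens the penalty for small date gaps rather than reshuffling the ranking.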
(Output: date_score for every game relative to game index 10, 1 row × 26507 columns; scores decay from 1.0 for games released near the target date down to roughly 0.22 for the most distant releases.)
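The effect of the smoothing exponent can be seen on its own. This standalone illustration (not part of the notebook) shows that an exponent below 1 bends the linear date distance into a concave curve, so nearby dates keep scores close to 1 while distant dates still decay:

```python
import numpy as np

dist = np.linspace(0.0, 1.0, 5)    # normalized distance from the target date
linear = 1 - dist                  # exponent 1 -> straight line
smoothed = (1 - dist) ** 0.8       # exponent 0.8 -> curve lifted upward
# every smoothed score is at least as high as the corresponding linear score
```

With smoothing closer to 0 the lift becomes stronger; with smoothing equal to 1 the score falls off linearly.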
In this section, I created five functions to get similar games based on features and scores.
score_tab: this function takes the target game index and the stored DataFrames as input and calculates the scores for each feature: developers, publishers, genres, categories, tags, ratings, release dates, and descriptions. It then joins these scores into a single DataFrame and returns it.
get_score: this function combines all the scores from score_tab into a single score by assigning a different weight to each feature based on its importance. These weights were determined through knowledge of video games, discussions with gamers, and testing. They give more importance to games released around the same time as the target game, on the assumption that users looking for a specific game may also be interested in other games from the same period. The function sorts the games by their total score and returns the top 30 similar games.
get_score_new_games: this function is similar to the previous one but adjusts the weights to give more importance to the release date, specifically favoring newer games. It was created to ensure that the recommendation system also includes new games for users to discover, regardless of the target game's release date. The other weights in this function were fine-tuned through extensive testing with known games.
get_results: this function combines the outputs of the get_score and get_score_new_games functions. It selects the top 4 similar games plus 3 random games from the rest of the top 30, removes any games already selected from the new-games list, and then adds the top 3 new games plus 2 random games from that list, for a total of 12 recommended games, the same number Steam displays on its website. The function then shuffles the results randomly and returns them.
get_recommendations: this function takes a game index as input and calls all the previous functions, producing the 12 recommended games in a single line of code. The final recommendations mix close matches, random selections from the top matches, and both new and older games to give the user a diverse set.
def score_tab(target,
              scores_df=stored_tables['scores_df'],
              dev_df=stored_tables['dev_df'],
              pub_df=stored_tables['pub_df'],
              tag_df=stored_tables['tag_df'],
              gen_df=stored_tables['gen_df'],
              cat_df=stored_tables['cat_df'],
              rating_df=stored_tables['rating_df'],
              date_df=stored_tables['date_df'],
              des_vectors_df=stored_tables['des_vectors_df']):
    scores = scores_df.join([
        get_dev_scores(dev_df, target),
        get_pub_scores(pub_df, target),
        single_row_jaccard_similarity(gen_df, target, 'gen_score'),
        single_row_jaccard_similarity(cat_df, target, 'cat_score'),
        single_row_cosine_similarity(tag_df, target, 'tag_score'),
        rating_df,
        date_score(date_df, target, 0.7),
        single_row_cosine_similarity(des_vectors_df, target, 'des_score'),
    ]).drop(target, axis=0)
    return scores
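The helpers single_row_jaccard_similarity and single_row_cosine_similarity are defined earlier in the notebook. For reference, a minimal sketch of what the Jaccard version plausibly computes for a 0/1 feature table (the function name and internals here are my assumption, not the notebook's actual implementation):

```python
import pandas as pd

def single_row_jaccard_sketch(df, target, col_name):
    # hedged sketch: shared features / total features, per game vs. the target
    row = df.loc[target].astype(bool)
    feats = df.astype(bool)
    inter = (feats & row).sum(axis=1)          # features shared with the target
    union = (feats | row).sum(axis=1)          # features present in either game
    return (inter / union).to_frame(col_name)  # NaN if both rows are all-zero

gen = pd.DataFrame({'Strategy': [1, 1, 0], 'Action': [0, 1, 1]},
                   index=[10, 20, 30])
res = single_row_jaccard_sketch(gen, 10, 'gen_score')
```

Game 20 shares one of two combined genres with game 10, giving a Jaccard score of 0.5, matching the fractional values (0.25, 0.333333, 0.5) visible in the genre-score output above.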
def get_score(scores, top_val=30):
    multipliers = {'dev_score': 5,
                   'pub_score': 6,
                   'gen_score': 7,
                   'cat_score': 8,
                   'tag_score': 10,
                   'weighted_rating': 35,
                   'date_score': 9,
                   'des_score': 20}
    scores = scores.mul(pd.Series(multipliers), axis=1)
    score = pd.DataFrame(scores.sum(axis=1), columns=['SCORE'])
    score = score.sort_values(by='SCORE', ascending=False)
    return score.head(top_val) / 100
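The weighted sum relies on DataFrame.mul aligning the multiplier Series by column name before the row-wise sum. A tiny illustration with made-up scores and only two of the feature columns:

```python
import pandas as pd

# toy data (not from the project): two games, two feature scores
toy = pd.DataFrame({'tag_score': [0.5, 0.2],
                    'des_score': [0.9, 0.1]}, index=[8930, 3900])
weights = pd.Series({'tag_score': 10, 'des_score': 20})
# per-column weighting, then row-wise sum: 0.5*10 + 0.9*20 = 23.0 for 8930
total = toy.mul(weights, axis=1).sum(axis=1)
```

Because alignment is by label, the order of columns in the DataFrame and keys in the Series does not matter, which keeps the multiplier dictionaries in the functions above easy to tune.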
# Raising (1 - days_norm) to a power greater than 1 turns the linear value
# into a steep curve, sharply decreasing the scores of games released
# further in the past
def get_score_new_games(scores, date_df=stored_tables['date_df'], top_val=30):
    scores = scores.copy()  # avoid mutating the caller's DataFrame
    scores['date_score'] = (1 - date_df['days_norm']) ** 5
    multipliers = {'dev_score': 6,
                   'pub_score': 7,
                   'gen_score': 5,
                   'cat_score': 6,
                   'tag_score': 10,
                   'weighted_rating': 10,
                   'date_score': 50,
                   'des_score': 1}
    scores = scores.mul(pd.Series(multipliers), axis=1)
    score = pd.DataFrame(scores.sum(axis=1), columns=['SCORE'])
    score = score.sort_values(by='SCORE', ascending=False)
    return score.head(top_val) / 100
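The exponent of 5 is the opposite lever to the smoothing used in date_score: a power above 1 makes the recency score fall off steeply as games get older. A standalone illustration (days_norm values are made up):

```python
import numpy as np

days_norm = np.linspace(0.0, 1.0, 5)   # 0 = most recent release, 1 = oldest
linear = 1 - days_norm
steep = (1 - days_norm) ** 5           # the curve used by get_score_new_games
# steep scores are always at or below the linear ones, so only very recent
# games keep a high date_score under this weighting
```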
def get_results(score, score_new_games, drop_score=True, add_titles=True, title=stored_tables['title']):
    # top 4 similar games, plus 3 random picks from the rest of the top 30
    result = pd.concat([score[:4],
                        score[4:].sample(3)],
                       axis=0)
    # drop games already selected, then top 3 new games plus 2 random picks
    score_new_games = score_new_games.drop(index=result.index, errors='ignore')
    new_games_results = pd.concat([score_new_games[:3],
                                   score_new_games[3:].sample(2)],
                                  axis=0)
    # combine and shuffle the 12 recommendations
    result = pd.concat([result, new_games_results], axis=0).sample(frac=1)
    if add_titles:
        result = result.join(title)
    if drop_score:
        result = result.drop(columns=['SCORE'])
    return result
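The mix-and-sample pattern (fixed top picks plus random picks from the remainder) is easy to verify on toy data; the indexes below are made up for illustration:

```python
import pandas as pd

score = pd.DataFrame({'SCORE': [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]},
                     index=[8930, 3900, 200510, 8800, 3910, 65980])
# keep the top-2 rows, then sample 2 more at random from the rest
picks = pd.concat([score[:2], score[2:].sample(2, random_state=0)])
```

The fixed top slice guarantees the strongest matches always appear, while the random slice varies the recommendations between page loads, mimicking what Steam's own "More Like This" section does.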
def get_recommendations(target, print_target_game=False, drop_score=True, add_titles=True):
    if print_target_game:
        print('Target game:', name_df.loc[target][0])
    scores = score_tab(target)
    score = get_score(scores)
    score_new_games = get_score_new_games(scores)
    return get_results(score, score_new_games, drop_score, add_titles)
# Known games' indexes for testing: 10, 45700, 6550, 566050, 289070
# Uncomment the second target assignment to test on a random game
target = 289070
# target = df.sample().index[0]
print('Target:', name_df.loc[target][0])
print('index:', target)
Target: Sid Meier’s Civilization® VI
index: 289070
get_recommendations(target)
name | |
---|---|
8930 | Sid Meier's Civilization® V |
1005930 | Timeflow – Time and Money Simulator |
200510 | XCOM: Enemy Unknown |
8800 | Civilization IV: Beyond the Sword |
4700 | Total War: MEDIEVAL II – Definitive Edition |
965240 | Akabeth Tactics |
1058590 | Franchise Wars |
977690 | Skyworld: Kingdom Brawl |
603850 | Age of Civilizations II |
48950 | Greed Corp |
4580 | Warhammer® 40,000: Dawn of War® - Dark Crusade |
3900 | Sid Meier's Civilization® IV |
scores = score_tab(target)
scores.sample(5)
dev_score | pub_score | gen_score | cat_score | tag_score | weighted_rating | date_score | des_score | |
---|---|---|---|---|---|---|---|---|
264560 | 0.0 | 0.0 | 0.000000 | 0.5 | 0.000000 | 0.679699 | 0.925683 | 0.943874 |
850690 | 0.0 | 0.0 | 0.000000 | 0.5 | 0.000000 | 0.250000 | 0.946092 | 0.955254 |
317940 | 0.0 | 0.0 | 0.000000 | 0.5 | 0.026965 | 0.431868 | 0.952733 | 0.970279 |
385350 | 0.0 | 0.0 | 0.333333 | 0.5 | 0.320498 | 0.442105 | 0.984058 | 0.969294 |
316370 | 0.0 | 0.0 | 0.000000 | 0.5 | 0.000000 | 0.334091 | 0.969073 | 0.974595 |
s = get_score(scores)
s.head().join(name_df)
SCORE | name | |
---|---|---|
8930 | 0.933661 | Sid Meier's Civilization® V |
3900 | 0.872145 | Sid Meier's Civilization® IV |
200510 | 0.871319 | XCOM: Enemy Unknown |
8800 | 0.848674 | Civilization IV: Beyond the Sword |
3910 | 0.843349 | Sid Meier's Civilization® III Complete |
s_new_games = get_score_new_games(scores)
s_new_games.join(name_df).join(stored_tables['rel_date']).head()
SCORE | name | release_date | |
---|---|---|---|
1058590 | 0.700207 | Franchise Wars | 2019-04-15 |
965240 | 0.699047 | Akabeth Tactics | 2019-04-11 |
603850 | 0.695300 | Age of Civilizations II | 2018-11-21 |
798510 | 0.685977 | SUPER DRAGON BALL HEROES WORLD MISSION | 2019-04-04 |
1046030 | 0.685899 | ISLANDERS | 2019-04-04 |
get_results(s, s_new_games, drop_score=False, add_titles=True)
SCORE | name | |
---|---|---|
603850 | 0.695300 | Age of Civilizations II |
1058590 | 0.700207 | Franchise Wars |
3900 | 0.872145 | Sid Meier's Civilization® IV |
607050 | 0.656442 | Wargroove |
8930 | 0.933661 | Sid Meier's Civilization® V |
65980 | 0.782468 | Sid Meier's Civilization®: Beyond Earth™ |
221380 | 0.809435 | Age of Empires II HD |
40970 | 0.787063 | Stronghold Crusader HD |
200510 | 0.871319 | XCOM: Enemy Unknown |
8800 | 0.848674 | Civilization IV: Beyond the Sword |
965240 | 0.699047 | Akabeth Tactics |
955410 | 0.654194 | Armoured Alliance |
In this game recommendation system, a variety of techniques have been used to find and recommend similar games for users.
Some of the key techniques are: Jaccard similarity for genres and categories, cosine similarity for tags and description vectors, a weighted rating score, and a smoothed release-date score.
This recommendation system achieved results comparable to the suggestions Steam shows on its website, and several of the recommended games overlap with Steam's own. The system's running time is approximately 1.45 seconds when the precomputed tables are provided.
However, there are some limitations, such as time constraints, limited computational resources, and lack of access to additional data that could potentially improve the recommendations.
Possible expansions for this recommendation system include:
By addressing these limitations and implementing the suggested expansions, it would be possible to further improve the accuracy and effectiveness of the game recommendation system, providing users with an even better experience when discovering new games to play.