# Python: average number of friends by age group

I have a DataFrame with 4 attributes, which can be seen below. What I want to do is take the name and age of each person and count the number of friends they have. Then, if two people have the same age but different names, take the average number of friends for that age. Finally, divide the age range into age groups and take the average for each group. This is how I tried it:

```
import datetime
from collections import defaultdict

start_time = datetime.datetime.now()

# select the attributes (features) of interest by position
friends = df.iloc[:, 3]
ages = df.iloc[:, 2]

# default dictionary with age as key and a list of friends as value
dictionary_age_friends = defaultdict(list)
# populate the dictionary with key age and values friend
for i, j in zip(ages, friends):
    dictionary_age_friends[i].append(j)
print("first dict")
print(dictionary_age_friends)

# second dictionary: entries with the same age are collected and the friends are counted
set_dict = {}
for x in dictionary_age_friends:
    list_friends = []
    for y in dictionary_age_friends[x]:
        list_friends.append(y)
    set_list_len = len(list_friends)  # each friend counts as 1
    set_dict[x] = set_list_len
print(set_dict)

# the same count again, this time by summing a 1 for every friend
set_dict = {}
for x in dictionary_age_friends:
    print("inside the loop")
    lis_1 = []
    for y in dictionary_age_friends[x]:
        lis_1.append(y)
    set_list = lis_1
    set_list = [1 for x in set_list]  # assign a friend with a number 1
    set_dict[x] = sum(set_list)

# a dictionary that assigns the ages to age groups
second_dict = defaultdict(list)
for i, j in set_dict.items():
    if i in range(16, 20):
        second_dict[i].append(j)
    elif i in range(20, 40):
        second_dict[i].append(j)
    elif i in range(40, 60):
        i = "MiddleAge"
        second_dict[i].append(j)
    elif i in range(60, 72):
        i = "old"
        second_dict[i].append(j)
print(second_dict)

print("final dict started")
new_dic = {}
for key, value in second_dict.items():
    # average of the counts collected under each key, rounded to 2 decimals
    new_dic[key] = round(sum(value) / len(value), 2)
end_time = datetime.datetime.now()
print(end_time - start_time)
print(new_dic)
```

Some of the feedback I got:

1. There is no need to build a list if you only want to count the number of friends.
2. Take two people with the same age, 18: one has 4 friends, the other 3. The current code concludes that the average is 7 friends.
3. The code is neither correct nor optimal.

Any suggestions or help? Thanks in advance.

## Buddha Community

I don't understand the names of the attributes, and you haven't mentioned which age groups you need to split your data into. In my answer I'll treat the data as if the attributes were:

``````index, name, age, friend

``````

To find the number of friends per person, I suggest using `groupby`.

input:

``````groups = df.groupby([df.iloc[:,0],df.iloc[:,1]]) # grouping by name(0), age(1)
amount_of_friends_df = groups.size() # gathering amount of friends for a person
print(amount_of_friends_df)

``````

output:

``````name  age
EUNK  25     1
FBFM  26     1
MYYD  30     1
OBBF  28     2
RJCW  25     1
RQTI  21     1
VLIP  16     1
ZCWQ  18     1
ZMQE  27     1

``````

To find the number of friends by age, you can also use groups:

input:

``````groups = df.groupby([df.iloc[:,1]]) # groups by age(1)
age_friends = groups.size()
age_friends=age_friends.reset_index()
age_friends.columns=(['age','amount_of_friends'])
print(age_friends)

``````

output:

``````    age  amount_of_friends
0   16                  1
1   18                  1
2   21                  1
3   25                  2
4   26                  1
5   27                  1
6   28                  2
7   30                  1

``````
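Note that `groups.size()` counts rows per age, so two 25-year-olds with one friend each show up as 2. If the goal is the average per person within an age, as in the question's example of 4 and 3 friends at age 18, a sketch along the following lines might be closer. The column names and the sample data here are assumptions (one row per person/friend pair), not the asker's actual DataFrame.

```python
import pandas as pd

# Hypothetical sample data: one row per (person, friend) pair.
df = pd.DataFrame({
    "name":   ["ANNA", "ANNA", "ANNA", "ANNA", "BOB", "BOB", "BOB", "CARL"],
    "age":    [18, 18, 18, 18, 18, 18, 18, 25],
    "friend": ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8"],
})

# Step 1: number of friends per person (rows per name/age pair).
friends_per_person = (
    df.groupby(["name", "age"]).size().reset_index(name="amount_of_friends")
)

# Step 2: average that per-person count over everyone who shares an age.
mean_friends_by_age = (
    friends_per_person.groupby("age")["amount_of_friends"].mean().reset_index()
)
print(mean_friends_by_age)
```

With this sample, age 18 averages to 3.5 (one person with 4 friends, one with 3) rather than 7.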

To calculate the average number of friends per age group, you can use categories and `groupby`.

input:

``````mean_by_age_group_df = age_friends.groupby(pd.cut(age_friends.age,[20,40,60,72]))\
.agg({'amount_of_friends':'mean'})
print(mean_by_age_group_df)

``````

`pd.cut` returns a categorical Series, which we use to group the data. Afterwards we use the `agg` function to aggregate the groups in the DataFrame.

output:

``````          amount_of_friends
age
(20, 40]           1.333333
(40, 60]                NaN
(60, 72]                NaN

``````
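The bins `[20, 40, 60, 72]` are right-closed and start at 20, so ages 16 and 18 do not land in any group. To mirror the question's `range(16, 20)`, `range(20, 40)`, and so on, one option is to pass left-closed bins and labels to `pd.cut`. This is only a sketch: the sample values are hypothetical, `MiddleAge` and `old` come from the question, and the first two labels are made-up names.

```python
import pandas as pd

# Hypothetical per-age values, shaped like the `age_friends` frame in the answer.
age_friends = pd.DataFrame({
    "age": [16, 18, 21, 25, 45, 65],
    "amount_of_friends": [1.0, 3.0, 1.0, 2.0, 4.0, 2.0],
})

# Left-closed bins [16, 20), [20, 40), [40, 60), [60, 72), like range(16, 20) etc.
bins = [16, 20, 40, 60, 72]
labels = ["Teen", "YoungAdult", "MiddleAge", "old"]  # first two labels are assumptions
age_group = pd.cut(age_friends["age"], bins=bins, labels=labels, right=False)

# Mean per age group, rounded to 2 decimals as in the question's code.
mean_by_age_group = (
    age_friends.groupby(age_group, observed=False)["amount_of_friends"]
    .mean()
    .round(2)
)
print(mean_by_age_group)
```

Using `right=False` keeps age 20 in the 20 to 40 group instead of the group below it, which matches how `range` treats its endpoints.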

