Abdullah
amirali1985
AI & ML interests
Mechanistic interpretability, high dimensional geometry, persona role playing.
Recent Activity
liked a dataset 1 day ago
thoughtworks/gemma_psychometrics_personas_responses
updated a collection 1 day ago
Psychometrics Resources
updated a collection 1 day ago
Psychometrics Resources
Organizations
steering_with_curvature_metrics
amirali1985/convsersations_corrigible_more_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 8.27k • 17
amirali1985/convsersations_power_seeking_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 8.27k • 15
amirali1985/convsersations_self_awareness_general_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 10k • 16
amirali1985/convsersations_sadness_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 9.78k • 14
models 15
amirali1985/interpreting_reward_models
Updated
amirali1985/gpt-neo-125m_hh_reward
Text Generation • 0.1B • Updated • 4
amirali1985/gpt-neo-125m_utility_reward
Reinforcement Learning • Updated • 3
amirali1985/pythia-70m_sentiment_reward
Reinforcement Learning • Updated • 3
amirali1985/pythia-160m_sentiment_reward
Reinforcement Learning • Updated • 7
amirali1985/gpt-neo-125m_sentiment_reward
Reinforcement Learning • Updated • 2
amirali1985/pythia-160m_utility_reward
Reinforcement Learning • Updated • 4
amirali1985/pythia-70m_utility_reward
Reinforcement Learning • 70.4M • Updated • 3
amirali1985/gpt-j-6b-sharded-bf16_sentiment_reward
Reinforcement Learning • Updated
amirali1985/pythia-410m_utility_reward
Reinforcement Learning • Updated
datasets 15
amirali1985/convsersations_humor_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 9.63k • 34
amirali1985/convsersations_wealth_seeking_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 9.5k • 14
amirali1985/convsersations_excitement_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 8k • 15
amirali1985/convsersations_rude_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 10k • 15
amirali1985/convsersations_sycophancy_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 9.61k • 20
amirali1985/convsersations_sadness_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 9.78k • 14
amirali1985/convsersations_self_awareness_general_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 10k • 16
amirali1985/convsersations_power_seeking_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 8.27k • 15
amirali1985/convsersations_corrigible_more_llama3.2-1B-it_large_with_curvature
Viewer • Updated • 8.27k • 17
amirali1985/convsersations_corrigible_more_llama3.2-1B-it_large_with_curvature_geom
Viewer • Updated • 8.27k • 13