ShodhKosh: Journal of Visual and Performing Arts
ISSN (Online): 2582-7472
AI-Generated Photobooks for Educational Use
Fehmina Khalique 1
1 Lloyd Law College, Greater Noida, Uttar Pradesh 201306, India
2 Assistant Professor, School of Sciences, Noida International University, 203201, India
3 Chitkara Centre for Research and Development, Chitkara University, Himachal Pradesh, Solan, 174103, India
4 Centre of Research Impact and Outcome, Chitkara University, Rajpura-140417, Punjab, India
5 Associate Professor, Department of Management Studies, JAIN (Deemed-to-be University), Bengaluru, Karnataka, India
6 Assistant Professor, Department of Design, Vivekananda Global University, Jaipur, India
7 Department of Information Technology, Vishwakarma Institute of Technology, Pune, Maharashtra, 411037, India
1. INTRODUCTION
In recent years, artificial intelligence (AI) has entered education at a rapid pace, transforming how learning resources are created, presented, and consumed. AI-generated photobooks are one promising but underexplored application for improving visual learning. Photobooks, traditionally curated collections of images organized around a central theme or story, have long been used in education to present complex information in an accessible and attractive visual format. The emergence of sophisticated AI models that generate high-quality, context-specific images from textual prompts has greatly broadened the possibility of creating individualized, dynamic, and pedagogically oriented photobooks. Visual resources are a crucial part of modern classroom teaching: they help learners understand abstract concepts, establish links among ideas, and memorize information Xu et al. (2024). Research in cognitive psychology indicates that images paired with meaningful text enhance learning through dual coding, multimodal processing, and sustained learner attention. Traditional photobooks, however, take considerable time, skill, and money to create. AI-based tools, by contrast, make it possible to create customized visual content quickly and iteratively, so photobooks can be tailored to different age groups, learning abilities, topics, and cultural backgrounds. This democratization of content creation enables teachers and institutions to develop educational materials that were previously unrealistic because of resource constraints Du et al. (2024). Generative AI has also revived visual storytelling as a pedagogical tool.
AI-created photobooks make it possible to present scientifically precise illustrations, historically inspired scenes, imaginative scenes, or culturally immersive depictions that would otherwise be absent from traditional repositories. In areas such as science, history, and language learning, on-demand image generation enables educators to complement curricula with images that match their instructional objectives Cao et al. (2023). The stepwise process flow of developing photobooks with the assistance of AI is presented in Figure 1. For example, a photobook in which AI-generated images detail the life cycle of a living being, reenact a historical event, or depict a narrative scene from literature can help learners engage more and understand better.
Figure 1
Table 1 Summary of AI-Generated Photobooks and Visual Learning Technologies
| Focus Area | Methodology | Key Findings | Benefits | Impact on Learning | Future Trends |
|---|---|---|---|---|---|
| Dual coding and visual cognition | Theoretical analysis | Visual + verbal inputs enhance retention | Supports multimodal learning | Improved memory and comprehension | Increased integration of visual AI tools |
| Multimedia learning Lambert and Stevens (2024) | Experimental studies | Learners benefit from well-designed visuals | Reduces cognitive load | Better conceptual understanding | AI-optimized multimedia design |
| Visual storytelling in education | Case studies | Story-driven visuals support engagement | Narrative-based learning | Higher motivation | AI-generated narratives and images |
| Digital visual media Kortemeyer (2023) | Mixed methods | Interactive visuals boost inquiry | Improved exploration | Enhanced critical thinking | Adaptive visual content via AI |
| Computer-generated images Sun et al. (2024) | Experimental | CGI increases clarity for abstract topics | Accuracy in depiction | Stronger conceptual grasp | AI-driven science simulations |
| EdTech personalization | Survey research | Personalized visuals improve participation | Tailored learning | Higher engagement | Fully personalized AI photobooks |
| GAN-based educational imagery Watts et al. (2023) | Technical evaluation | GAN images effective for diverse contexts | High-quality visuals | Better inclusivity | Bias-reduced generative models |
| Digital photobooks in learning | Classroom trials | Photobooks improve reading motivation | Strong visual narrative | Better literacy outcomes | Hybrid AI-human content creation |
| AI for historical visualization | Controlled experiment | AI recreations increase curiosity | Realistic representations | Increased inquiry tasks | Accurate AI historical reconstructions |
| AI in science diagrams Denny et al. (2023) | Quantitative | AI diagrams outperform manual ones | Fast generation | Higher test scores | Automated curriculum-aligned visuals |
3. Conceptual Framework
3.1. Theories supporting visual learning
Visual learning rests on several established cognitive and pedagogical theories that explain how imagery supports understanding, retention, and engagement. Among the most influential is Paivio's Dual Coding Theory, which holds that people process information through two channels, one verbal and one nonverbal. When information is presented both visually and as text, learners form richer mental representations, which are better remembered and understood. In addition, Mayer's Cognitive Theory of Multimedia Learning holds that learning improves when visuals and text are meaningfully combined, which reduces cognitive load and supports active processing of information. Gestalt principles also inform visual learning, since they describe how people perceive patterns, relationships, and structures in images. These principles, such as proximity, similarity, and continuity, help learners make sense of visual information efficiently. Constructivist views likewise emphasize the role of visuals in helping learners build knowledge by interpreting new material and relating it to prior experience.
3.2. AI-Driven Content Creation Models
AI-based content creation models mark the next stage in how educational content is conceptualised, generated, and personalised. These systems are driven by machine learning models (generative adversarial networks (GANs), diffusion models, and transformer-based models) that can generate high-quality images from textual descriptions. GANs work through competition between a generator network and a discriminator network, which pushes the generator toward more realistic images. Diffusion models, in contrast, progressively remove noise until a coherent image emerges, and they permit fine-grained control over style, detail, and conceptual accuracy. In educational settings, these models can be used to develop photobooks quickly and according to learners' needs. They can create images that align with curriculum objectives, adapt images to different levels of proficiency, and support inclusivity by representing different cultures, abilities, and settings. Prompt engineering, i.e. the process of writing precise textual inputs, is very important for the quality and relevance of the generated output. Teachers can refine prompts iteratively to obtain images that serve specific educational goals. AI-based content creation is also supported by multimodal models that can analyze and integrate visual, linguistic, and contextual information. These systems help create integrated image-text elements for photobooks in which the visuals support the written descriptions.
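The generator-discriminator competition described above is usually written as a minimax objective. The formulation below is the standard one from the GAN literature rather than something stated in this paper; the symbols G (generator), D (discriminator), p_data (distribution of real images), and p_z (noise prior) are introduced here only for illustration.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator D tries to maximize V by telling real images from generated ones, while the generator G tries to minimize it by producing images that D accepts as real.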
3.3. Relationship Between AI-Generated Photobooks and Learner Engagement
Because AI can produce photobooks with personalised visuals and a narrative format, AI-generated photobooks have a distinctive potential to increase learner engagement and make learning more immersive. Engagement has cognitive, emotional, and behavioral dimensions, and visually rich educational materials can positively influence all of them. Cognitively, AI-generated images can turn abstract or complex concepts into concrete ones, reducing cognitive load and improving understanding. When pictures clarify meaning and support deeper processing, learners find it easier to persist with a task. Emotionally, AI-generated photobooks can spark curiosity and inspiration through colorful, imaginative, or context-specific images that appeal to learners' interests and backgrounds. Personalized content enables educators to include culturally specific or locally contextualized images that create a sense of belonging and personal connection. This emotional appeal often boosts intrinsic motivation and sustained focus during learning activities. Behaviorally, photobooks encourage active participation through guided exploration, discussion of images, and opportunities for interpretation.
4. Methodology
4.1. Research design
The study uses a mixed-methods research design to investigate the development, application, and educational value of AI-generated photobooks. A mixed-methods approach is especially suitable because it combines the strengths of quantitative and qualitative methods and gives deeper insight into the effects of AI-generated visual content on learning. The quantitative part measures learning outcomes, engagement rates, and user satisfaction through structured surveys and controlled experiments. These instruments produce numerical data that can be used to identify statistically significant trends, associations, and differences between groups that use AI-generated photobooks and groups that use conventional materials. The qualitative part examines learners' interpretive responses, educators' perceptions, and the photobooks themselves to reveal more about usability, visual meaning, and pedagogical fit. It involves interviews, open-ended survey responses, and content analysis of photobook samples. Qualitative data also enrich the study by revealing contextual factors that influence how learners interact with AI-generated images.
4.2. Data Collection Methods (Surveys, Experiments, Analysis of Photobooks)
Data are collected through several complementary approaches to measure the effectiveness and educational benefit of AI-generated photobooks. First, students and educators complete questionnaires about usability, comprehensibility, interactivity, and aesthetics. The surveys combine Likert-scale items with open questions, which yields both quantitative measures and qualitative information and helps capture general trends across participant groups. The second approach is experimental. Controlled learning activities are designed in which students work with either AI-generated photobooks or conventional visual materials. Changes in knowledge, comprehension, and retention are measured with pre-tests and post-tests. Engagement is also measured with behavioral indicators such as time on task, the number of times learners refer to the pictures, and patterns of question asking. The third approach is an analysis of the photobooks themselves. It assesses image quality, thematic coherence, accessibility features, and correspondence with curriculum requirements. Rubrics are used to rate visual accuracy, diversity of representation, and pedagogical appropriateness. The photobook production workflow is also analyzed, including prompt design, choice of tools, and iterative refinement.
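As an illustration of how the Likert-scale survey items might be summarized, the sketch below computes a mean rating and the share of agreeing responses per item. The 5-point scale, the agreement threshold of 4, the function name `summarize_likert`, and the sample ratings are all assumptions made for this example; the paper does not specify its survey instrument or report item-level data.

```python
from statistics import mean


def summarize_likert(responses, agree_from=4):
    """Return the mean rating and the share of ratings >= agree_from per item."""
    summary = {}
    for item, ratings in responses.items():
        agree = sum(1 for r in ratings if r >= agree_from)
        summary[item] = {
            "mean": round(mean(ratings), 2),
            "pct_agree": round(100 * agree / len(ratings), 1),
        }
    return summary


# Hypothetical 5-point ratings (1 = strongly disagree, 5 = strongly agree).
responses = {
    "usability": [4, 5, 3, 4, 5],
    "comprehensibility": [5, 4, 4, 2, 5],
}
summary = summarize_likert(responses)
for item, stats in summary.items():
    print(item, stats)
```

Reporting both the mean and the share of agreeing responses is a common way to keep a skewed distribution from hiding behind a single average.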
4.3. Participant Selection and Sampling
Participants are selected through a purposive sampling strategy designed to represent different levels of education and learning contexts. The research involves students and teachers from primary, secondary, and higher education institutions, which makes it possible to compare how AI-generated photobooks function across age groups and educational requirements. Purposive sampling is used to include participants with different degrees of familiarity with digital learning tools, so that the study captures a range of views on usability and engagement. Within each level of education, stratified sampling is applied to form subgroups by subject area: science, history, and language learning. This stratification enables further analysis of how well photobooks support discipline-specific learning outcomes. Teachers are chosen on the basis of teaching experience, willingness to use new technologies, and role in curriculum development. Their comments help clarify the pedagogical potential and practical challenges of implementing AI-generated photobooks. Student participants are recruited through institutional announcements and classroom invitations.
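The stratification by education level and subject area can be sketched as follows. This is a minimal illustration, not the authors' procedure: the roster, the field names `level` and `subject`, the per-stratum quota, and the fixed random seed are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(participants, per_stratum, seed=0):
    """Draw up to `per_stratum` participants from each (level, subject) stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for person in participants:
        strata[(person["level"], person["subject"])].append(person)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample


levels = ["primary", "secondary", "higher"]
subjects = ["science", "history", "language"]
# Hypothetical roster: five participants in every level/subject combination.
roster = [
    {"id": f"{lvl}-{sub}-{i}", "level": lvl, "subject": sub}
    for lvl in levels
    for sub in subjects
    for i in range(5)
]
chosen = stratified_sample(roster, per_stratum=2)
print(len(chosen))  # 9 strata x 2 participants = 18
```

Sampling within each stratum separately guarantees that every level and subject combination is represented, which a single draw from the whole roster would not.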
5. Development of AI-Generated Photobooks
5.1. Tools and technologies used (image generation, layout design)
The development of AI-generated photobooks relies on the integration of image generation systems, layout design platforms, and content integration platforms. Image creation is typically driven by state-of-the-art AI models, including diffusion models, generative adversarial networks (GANs), and transformer-based multimodal models. Such systems translate written prompts into visually consistent images, which allows educators to develop illustrations that match their instructional objectives. Diffusion models in particular are popular because of their high image fidelity, their control over stylistic features, and their ability to produce varied representations that improve inclusiveness and contextual relevance. Besides image generation software, layout design software is also important. Tools such as Canva, Adobe InDesign, and AI-assisted layout generators allow photographic, textual, and interactive content to be organized into a structured photobook layout. Figure 2 identifies the main technological components that make it possible to create AI-generated photobooks. These tools provide customizable templates, accessibility features, and alignment guides that support visual and pedagogical coherence.
Figure 2 Technological Components for Creating AI-Generated Photobooks
Other AI-based design tools can generate a layout automatically by analyzing the content and suggesting a suitable arrangement. Educators can use learning management systems or collaborative digital publishing tools that support embedded multimedia to integrate text, metadata, and annotations. These platforms enable smooth distribution, versioning, and cross-platform compatibility.
5.2. Criteria for Educational Photobook Creation
Successful AI-generated photobooks must meet specific pedagogical, aesthetic, and ethical standards. First, content must be accurate, especially when photobooks represent scientific ideas, historical images, or culturally sensitive information. Educators need to validate AI-generated pictures to eliminate misrepresentations and factual errors. Alignment with curriculum standards ensures that the visuals address a particular learning goal and reinforce the key ideas. Second, cognitive load and visual clarity govern the choice and arrangement of images. Photobooks should present information in an organized way, with a clear layout, readable typography, and effective color contrast. Images should be free of clutter and ornamental details that may distract learners from the main material. Consistency in style and viewpoint helps preserve coherence across pages. Third, inclusivity and representation are essential. Photobooks should represent different cultures, abilities, and environments so that no learner feels neglected. Image descriptions, simplified images, and labels in other languages improve accessibility for people with different learning needs. Fourth, engagement value should be taken into account. Images should spark curiosity, support narrative development, and create an emotional connection.
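The four criteria above could be applied as a simple page-level checklist. The sketch below is one possible way to do so, assuming a reviewer rates each page from 1 to 5 on every criterion; the rating scale, the minimum score of 3, the function name `review_page`, and the sample ratings are hypothetical and not taken from the paper.

```python
CRITERIA = ("accuracy", "clarity", "inclusivity", "engagement")


def review_page(scores, minimum=3):
    """Return the criteria on which a page scores below `minimum` (1-5 scale)."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"unscored criteria: {missing}")
    return [c for c in CRITERIA if scores[c] < minimum]


# Hypothetical reviewer ratings for two photobook pages.
pages = {
    "page_1": {"accuracy": 5, "clarity": 4, "inclusivity": 4, "engagement": 3},
    "page_2": {"accuracy": 2, "clarity": 4, "inclusivity": 3, "engagement": 5},
}
flagged = {}
for name, scores in pages.items():
    failures = review_page(scores)
    if failures:
        flagged[name] = failures
print(flagged)  # {'page_2': ['accuracy']}
```

Flagging individual criteria, rather than averaging them, keeps a factual error from being masked by high scores for visual appeal.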
5.3. Workflow from Prompt Design to Final Output
The development of AI-generated photobooks begins with careful prompt design, a crucial step that determines the quality and relevance of the generated images. Teachers write prompts that specify content, genre, level of difficulty, cultural dimension, and teaching intention. Prompts often need to be refined iteratively, since the results may need improvement in detail, accuracy, or visual tone. Once the prompts are finalized, image generation models produce several variations of each image. Teachers assess these outputs for accuracy, variety, and correspondence to the desired learning outcomes. Selected images can be lightly edited, i.e. cropped, color-corrected, or annotated, to increase clarity and pedagogical appropriateness. The second phase is the arrangement of images in layout design software. Teachers organize the images and accompanying text in a logical order that follows the storyline or the development of ideas. Layout software helps ensure even spacing, alignment, and visual hierarchy. At this phase, accessibility features such as alt text, captions, and simplified diagrams can be included. Once the layout is complete, the photobook undergoes a review process.
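The prompt design step can be sketched as a small template that joins the five specifications named above into one prompt string and appends refinement notes on later iterations. The class `PromptSpec`, the function `build_prompt`, the field names, and the example values are hypothetical; no particular image generation tool or prompt syntax is implied, and the resulting string would still need to be passed to whichever model the teacher uses.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the prompt specifications listed above."""
    content: str           # what the image must show
    style: str             # genre or visual style
    difficulty: str        # intended learner level
    cultural_context: str  # cultural setting to reflect
    teaching_intent: str   # what the image should help learners do


def build_prompt(spec: PromptSpec, refinements: tuple[str, ...] = ()) -> str:
    """Join the specification fields and any refinement notes into one prompt."""
    parts = [
        spec.content,
        f"style: {spec.style}",
        f"audience: {spec.difficulty}",
        f"cultural context: {spec.cultural_context}",
        f"purpose: {spec.teaching_intent}",
    ]
    parts.extend(refinements)
    return "; ".join(parts)


spec = PromptSpec(
    content="life cycle of a butterfly in four labelled stages",
    style="flat educational illustration",
    difficulty="primary school",
    cultural_context="garden plants common in northern India",
    teaching_intent="help learners order the stages correctly",
)
first_draft = build_prompt(spec)
# Second iteration: the teacher adds a note after reviewing the first images.
second_draft = build_prompt(spec, refinements=("plain white background",))
print(second_draft)
```

Keeping the specification separate from the refinement notes makes each iteration traceable, which is useful when the production workflow itself is later analyzed.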
6. Result and Discussion
The findings show that AI-generated photobooks have a strong positive influence on learners' engagement, understanding, and visual interest across educational levels. Students reported that they understood abstract concepts better, and teachers appreciated the flexibility and the ability to customize the materials. Tests indicated that retention increased when AI-generated visuals accompanied instructional text. Nonetheless, some concerns were raised about occasional inaccuracies and the need to review generated content carefully. The discussion highlights the importance of balancing creativity with academic rigor: AI-generated photobooks are most effective when they are carefully curated and aligned with the learning goals.
Table 2 Comparison of Learning Outcomes Between Control and Experimental Groups
| Measure | Control Group (Traditional Materials) | Experimental Group (AI-Generated Photobooks) |
|---|---|---|
| Mean Pre-Test Score (%) | 43.4 | 47.1 |
| Mean Post-Test Score (%) | 61.2 | 80.8 |
| Retention Score (2-week follow-up) | 28.7 | 36.4 |
| Engagement Rating (%) | 61 | 88 |
| Time on Task (minutes) | 18.6 | 24.9 |
Table 2 compares the learning outcomes of students who worked with traditional materials and those who worked with AI-generated photobooks. The findings show a consistent advantage for the experimental group on all measured indicators. The two groups started with comparable average pre-test scores, with the experimental group slightly higher (47.1% versus 43.4%), which indicates similar baseline knowledge. Figure 3 presents the learning results of the traditional and AI-generated photobook approaches.
Figure 3 Learning Outcome Comparison Between Traditional and AI-Generated Photobook Methods
The largest difference is observed in the post-test scores: learners who used AI-generated photobooks scored much higher (80.8%) than those who used the traditional resources (61.2%). This substantial gain suggests that AI-generated visuals can make concepts easier to understand.
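Because the two groups did not start from exactly the same pre-test score, it can help to compare gains rather than raw post-test scores. The sketch below uses only the group means reported in Table 2. The normalized gain, (post - pre) / (100 - pre), is a measure added here for illustration; it is not a statistic reported in the paper, and no significance test can be derived from group means alone.

```python
# Mean scores (%) copied from Table 2.
scores = {
    "control": {"pre": 43.4, "post": 61.2},
    "experimental": {"pre": 47.1, "post": 80.8},
}


def absolute_gain(pre: float, post: float) -> float:
    """Percentage points gained between pre-test and post-test."""
    return post - pre


def normalized_gain(pre: float, post: float) -> float:
    """Share of the available headroom (100 - pre) that was gained."""
    return (post - pre) / (100.0 - pre)


gains = {
    group: {
        "absolute": round(absolute_gain(s["pre"], s["post"]), 1),
        "normalized": round(normalized_gain(s["pre"], s["post"]), 2),
    }
    for group, s in scores.items()
}
for group, g in gains.items():
    print(f"{group}: +{g['absolute']} points, normalized gain {g['normalized']}")
# control: +17.8 points, normalized gain 0.31
# experimental: +33.7 points, normalized gain 0.64
```

On these figures the experimental group gained 33.7 percentage points against 17.8 for the control group, so the advantage holds after accounting for the slightly higher starting score.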
Figure 4 Performance and Engagement Differences Between Control and Experimental Groups
The experimental group also has higher retention scores at the two-week follow-up (36.4 versus 28.7), which indicates better long-term memory of the learned material. Figure 4 presents differences in performance and engagement between the experimental and control groups. The engagement ratings also demonstrate the benefits of AI-generated photobooks: the experimental group reported a higher engagement rating (88%) than the control group (61%). This implies that informative content that is visually engaging and contextually relevant generates more learner interest and involvement.
7. Conclusion
AI-generated photobooks are a notable advance in educational media, as they provide new means of incorporating visual learning into current teaching practice. The results of this paper indicate that, applied properly, these photobooks can facilitate understanding, encourage interaction, and address students' individual needs. Because they provide tailored, context-driven visuals, educators can address the needs of a particular subject, grade level, and cultural environment in a way that is more relevant and inclusive. Such flexibility makes AI-generated photobooks especially useful in contemporary, increasingly diverse classrooms. Despite these advantages, adoption must be implemented thoughtfully. AI-generated images should be checked for accuracy, clarity, and ethical soundness to guarantee that learners receive high-quality and culturally sensitive information. To make the best use of their pedagogical value, teachers need to build expertise in prompt design, content validation, and layout choices. Helping educators use these tools with confidence and responsibility will require professional development and institutional support. The research also has implications for future educational innovation. AI-generated photobooks can serve as an example of customized learning content that helps bridge gaps between visual and written comprehension. Their flexible structure promotes critical thinking, imaginative inquiry, and multimedia interaction, which is in line with current education systems.
CONFLICT OF INTERESTS
None.
ACKNOWLEDGMENTS
None.
REFERENCES
Cao, Y., Li, S., Liu, Y., Yan, Z., Dai, Y., Yu, P. S., and Sun, L. (2023). A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT. arXiv.
Denny, P., Khosravi, H., Hellas, A., Leinonen, J., and Sarsa, S. (2023). Can We Trust AI-Generated Educational Content? Comparative Analysis of Human and AI-Generated Learning Resources. arXiv.
Du, H., Zhang, R., Liu, Y., Wang, J., Lin, Y., Li, Z., Niyato, D., Kang, J., Xiong, Z., Cui, S., et al. (2024). Enhancing Deep Reinforcement Learning: A Tutorial on Generative Diffusion Models in Network Optimization. IEEE Communications Surveys and Tutorials, 26, 2611–2646. https://doi.org/10.1109/COMST.2024.3400011
Guo, D., Chen, H., Wu, R., and Wang, Y. (2023). AIGC Challenges and Opportunities Related to Public Safety: A Case Study of ChatGPT. Journal of Safety Science and Resilience, 4, 329–339. https://doi.org/10.1016/j.jnlssr.2023.08.001
Jalil, S., Rafi, S., LaToza, T. D., Moran, K., and Lam, W. (2023). ChatGPT and Software Testing Education: Promises and Perils. In Proceedings of the 2023 IEEE International Conference on Software Testing, Verification and Validation Workshops (4130–4137). https://doi.org/10.1109/ICSTW58534.2023.00078
Kasneci, E., Seßler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., Gasser, U., Groh, G., Günnemann, S., Hüllermeier, E., et al. (2023). ChatGPT for Good? On Opportunities and Challenges of Large Language Models for Education. Learning and Individual Differences, 103, 102274. https://doi.org/10.1016/j.lindif.2023.102274
Kortemeyer, G. (2023). Could an Artificial-Intelligence Agent Pass an Introductory Physics Course? Physical Review Physics Education Research, 19, 010132. https://doi.org/10.1103/PhysRevPhysEducRes.19.010132
Lambert, J., and Stevens, M. (2024). ChatGPT and Generative AI Technology: A Mixed Bag of Concerns and New Opportunities. Computers in the Schools, 41, 559–583. https://doi.org/10.1080/07380569.2023.2256710
Lee, D., and Yeo, S. (2022). Developing an AI-Based Chatbot for Practicing Responsive Teaching in Mathematics. Computers and Education, 191, 104646. https://doi.org/10.1016/j.compedu.2022.104646
Liang, Y., Zou, D., Xie, H., and Wang, F. L. (2023). Exploring the Potential of using ChatGPT in Physics Education. Smart Learning Environments, 10, 52. https://doi.org/10.1186/s40561-023-00273-7
Rudolph, J., Tan, S., and Tan, S. (2023). ChatGPT: Bullshit Spewer or the End of Traditional Assessments in Higher Education? Journal of Applied Learning and Teaching, 6, 342–363. https://doi.org/10.37074/jalt.2023.6.1.9
Sun, Y., Sheng, D., Zhou, Z., and Wu, Y. (2024). AI Hallucination: Towards a Comprehensive Classification of Distorted Information in Artificial Intelligence-Generated Content. Humanities and Social Sciences Communications, 11, 1278. https://doi.org/10.1057/s41599-024-03811-x
Tao, W., Gao, S., and Yuan, Y. (2023). Boundary Crossing: An Experimental Study of Individual Perceptions Toward AIGC. Frontiers in Psychology, 14, 1185880. https://doi.org/10.3389/fpsyg.2023.1185880
Watts, F. M., Dood, A. J., Shultz, G. V., and Rodriguez, J.-M. G. (2023). Comparing Student and Generative Artificial Intelligence Chatbot Responses to Organic Chemistry Writing-To-Learn Assignments. Journal of Chemical Education, 100, 3806–3817. https://doi.org/10.1021/acs.jchemed.3c00664
Wu, T.-T., Lee, H.-Y., Li, P.-H., Huang, C.-N., and Huang, Y.-M. (2024). Promoting Self-Regulation Progress and Knowledge Construction in Blended Learning Via ChatGPT-Based Learning Aid. Journal of Educational Computing Research, 61, 3–31. https://doi.org/10.1177/07356331231191125
Xu, M., Du, H., Niyato, D., Kang, J., Xiong, Z., Mao, S., Han, Z., Jamalipour, A., Kim, D. I., Shen, X., et al. (2024). Unleashing the Power of Edge–Cloud Generative AI in Mobile Networks: A Survey of AIGC Services. IEEE Communications Surveys and Tutorials, 26, 1127–1170. https://doi.org/10.1109/COMST.2024.3353265
Yang, S., Yang, S., and Tong, C. (2023). In-Depth Application of Artificial Intelligence-Generated Content (AIGC) Large Models in Higher Education. Adult and Higher Education, 5, 9–16. https://doi.org/10.23977/aduhe.2023.051902
This work is licensed under a: Creative Commons Attribution 4.0 International License
© ShodhKosh 2025. All Rights Reserved.